ABSTRACT

This study shows strategies which can be used to plan
and implement the Canadian National Library's bibliographic data base
and the systems which it is to support. The data base would be a
subset of a national bibliographic data base which can be brought
[...] software are analyzed, and a strategy is outlined for development of
the bibliographic data base. (CH)
PREFACE

In July 1973 the National Librarian awarded me a contract with the
following terms of reference:

"Mr. Duchesne will conduct a study of the organization and
content of the Canadian National data base of machine-
readable bibliographic records and the means by which this
may interface with other national and international data
bases and machine-readable distribution services. The
study will cover the use of various national MARC tape
services as input to the Canadian bibliographic data base;
the problems of converting other MARC formats to the
Canadian MARC format; the dissemination within Canada of
records received from foreign tape services to regional
centres and libraries; [...]
tape services for national distribution; the need for
changes and adaptation of foreign records for compatibility
with the Canadian format and data base; as well as the
organization and accessing of the data base."

Contracts are not easily arranged between parties separated by thousands
of miles and I am greatly indebted to the National Librarian, the Associate
National Librarian, Miss Hope Clement - and to A.J. Wells and Mr. R.E.
Coward of the British National Bibliography Limited, London, England -
for the opportunity to undertake the contract. BNB released me for two-
thirds of the study period.

[...] his staff is gratefully acknowledged, in particular, information supplied by
Mr. P. Wolters, Mr. J. Heilik and Mrs. A. Harvey.

Wider afield I am grateful for information, and in many cases advice, from
Mr. B. Stuart Stubbs (Chairman of the Union Catalogue Task Group), Mr. R.
Stierwalt (Director, Ontario Universities' Library Co-operative Systems Project),
[...]

SUMMARY

The present study is essentially a modelling exercise employing available
figures and information with order-of-magnitude projections to show the
broad strategies which should be followed in the planning and implementation
of the National Library's bibliographic data base and the systems which this
base is to support over the next five years. This base is a subset of the
integrally organized national bibliographic data base which can be brought
together in the context of the projected Canadian Library and Information
Retrieval Centre.



EDP-based systems and services planned 
by the National Library (pp. 5-13) 

In the next two years the National Library will complete the full implementation
of its EDP Canadiana systems, and will go on to implement and/or enhance
EDP systems in the areas of union catalogues and lists, acquisitions, selective
dissemination of information, and - possibly - on-line catalogue support for
federal government and other libraries. Most of these EDP systems will centre
around the Canadian National Union Catalogue (CANUC) EDP systems, which will
cover both serials and non-serials and will provide human-readable union
catalogues and lists - as well as an on-demand location service similar to but faster
than that already provided. In systems planning it is already accepted as a
principle that new EDP applications should be developed as far as possible as
modules of a single overall system.

Hardware and software (pp. 14-27)

Hardware availability is a major factor in design of any system with a sizeable
on-line data base used many hours a day, and it is of the first importance to
National Library system planning to establish the likely future hardware position.
Guaranteed access to hardware with suitable operating, communications and data
base management software is a prerequisite for the implementation of an integrated
on-line data base serving multiple applications including the CANUC EDP system.
This implies that an extremely high priority should now be accorded to
the planning and implementation of the Library and Information Retrieval
Centre. The present position of the organizations principally concerned
with the Centre is briefly outlined (pp. 15-16).

Software requirements for maintenance and use of an integrated on-line data
base are analysed. In the context of the Library and Information Retrieval
Centre, the National Library will wish to select an operating system and
communications monitor likely to be most acceptable and useful to the Centre's
participants. Provided that a data base management software package can be
found to meet the Library's requirements within acceptable performance, cost
and other parameters, there will be very significant economic and elapsed
time benefits in making use of such a package. A preliminary survey of
National Library data base management requirements and data base management
software packages is reported. The conclusion of this survey is that
software packages exist which would substantially aid the National Library
in the establishment of an integrated bibliographic information system, but
that package selection is critically dependent upon the choice of host computer
system. More detailed conclusions (pp. 22-23) are also drawn, leading to a
series of recommendations (pp. 26-27).

The National Library bibliographic data base:
basic strategy (pp. 28-42)

The key to any integrated EDP system design for the National Library is the
CANUC system, since this will have to process a larger volume of records than
all the remainder of the National Library's EDP applications put together,
when it becomes operational in 1977. The critical feature in the design of
the CANUC EDP system is the means by which the input of some 1.7 million
accession reports per year is to be achieved. An on-line input system
will be much more efficient than a batch input system, largely because
such a system facilitates the use of existing machine-readable data -
and such data will exist for over 88% of reports
received, following initial build-up of the EDP file.
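
The scale of the input problem can be illustrated with a short calculation. The following is a minimal sketch in Python, not part of the study itself: it takes the 1.7 million accession reports per year and the 88% figure from the paragraph above, treats 88% as a lower bound (the text says "over 88%"), and uses function and variable names that are assumptions introduced purely for illustration.

```python
# Illustrative arithmetic only. Figures come from the text; all names are hypothetical.
REPORTS_PER_YEAR = 1_700_000   # accession reports per year (from the text)
EXISTING_DATA_PERCENT = 88     # "over 88%" of reports, used here as a lower bound


def split_reports(reports: int, existing_percent: int) -> tuple[int, int]:
    """Return (reports matched by existing machine-readable data,
    reports needing full original input), using integer arithmetic."""
    matched = reports * existing_percent // 100
    return matched, reports - matched


matched, unmatched = split_reports(REPORTS_PER_YEAR, EXISTING_DATA_PERCENT)
print(f"at least {matched:,} reports per year can reuse existing data")
print(f"at most {unmatched:,} reports per year need full original input")
```

On these figures roughly 1.5 million reports a year could draw on data already in the file, which is why the means of input dominates the design.
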

The economics of on-line input and storage in a hypothetical CANUC EDP system
are examined in some detail, leading to a series of conclusions (pp. 33-36).
Economic conclusions are dependent on computer hardware being made available
at, or near, cost. Principal conclusions are that the National Library should
plan to use an integrated on-line data base containing CANUC, National Library
acquisitions and in-process files, and selected MARC records to support multiple
applications. These applications include CANUC accession reporting, CANUC
location service, CANUC publication, Canadiana, National Library EDP cataloguing,
National Library acquisitions, and catalogue support services for Canadian
libraries. Further, all potentially useful records and record data fields
should be added to the initial on-line file which should be monitored; following
monitoring, low-use records and record data fields should be removed from the
on-line file as soon as they have been identified.

The outline is drawn of an integrated on-line National Library data base
including CANUC, MARC, internal and authority files (pp. 36-38). The order-of-
magnitude on-line file size is then computed on the basis of a hypothetical
data base strategy, showing an on-line file size of under 1,200 million bytes
for the first six years of system operation. For purposes of comparison it
is noted that the national bibliographic data base of France will be stored on
a computer installation with 2,400 million bytes of disk storage. Finally, the
basic strategy illustrated is compared with approaches adopted by similar
projects elsewhere: national level projects in France and Sweden, and the
Ohio College Library Center family of regional projects in the USA. The
recommended strategy is very similar to the strategies adopted or planned in
these projects. The high cost, size and long-term nature of these projects are
noted as well as the fact that problems associated with bilingualism and
geographic size will not allow Canadian development to be achieved 'on the
cheap'.

Further aspects (pp. 43-56)

It is emphasized that the National Library data base and systems strategy will
need to be developed within a changing national and international bibliographic
EDP network context. Conclusions are quoted (pp. 44-45) from a survey of French
and Swedish network development plans and from a modelling study designed to
throw light on the economics of alternative on-line catalogue service networks
for Canada. Based on these conclusions it is recommended that the National
Library, in co-operation with other libraries across Canada, should work out
a plan for Canadian national bibliographic EDP network development. It is
noted that this recommendation accords closely with the views and recommendations
of the Canadian Computer/Communications Task Force. Further, a number of
immediate tasks in the network planning area are identified (pp. 43-46).

Next, the hardware facility implications of data base and system strategy
recommendations are examined. A computer configuration is detailed, illustrating -
in the context of the Library and Information Retrieval Centre - the basic
facilities required for adequate support of a National Library on-line integrated
bibliographic information system (pp. 46-49).

The use and treatment of MARC and ISDS in the latter system are then outlined 
with the conclusion that (selected) ISDS, US MARC, UK MARC and French MARC 
records should be added to the on-line base. On the basis of current information 
it appears that foreign MARC tape distribution services within Canada might 
initially be limited to MARC records of the US, UK and France, and that the 
preferred format for distribution is the Canadian MARC format. In this connection 
the conclusions of a substudy are noted: these are that machine translation 
between the relevant formats should be acceptable for most practical purposes 
and that translation costs are unlikely to be excessive, although each format 
translation program will take some man-months to specify, program and bring 
to operational status (pp. 50-51). 

Finally, the possibilities are examined for publication of CANUC supplements via
a CANUC EDP system. Hard copy publication is not recommended, primarily for cost
and handling reasons: microfiche or ultrafiche are preferred as forms of
publication. Concerning patterns of publication, indexes issued every six
months and cumulating over five years are preferred. It is calculated that
supplements with a continuously running bibliographic section and indexes of the
preferred issue and cumulation pattern could be sold at cost to 300 libraries
for an annual subscription of the order of $630 (microfiche) or $750 (ultrafiche)
per subscriber. To help provide potential users with a complete picture of the
possible economics of this service, suitable microfiche and ultrafiche readers are
identified, and their purchase prices as at February 1974 are noted (pp. 51-56).
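
The subscription figures imply an order-of-magnitude annual cost for the publication service. The sketch below, in Python, multiplies the quoted subscription by the 300 subscribing libraries. It assumes that "sold at cost" means total subscription income just covers the annual cost of the service; that reading, and the names used, are assumptions made here for illustration and are not figures stated in the study.

```python
# Illustrative arithmetic only. Prices and subscriber count come from the text;
# the "income equals cost" reading and all names are assumptions.
SUBSCRIBERS = 300
ANNUAL_SUBSCRIPTION = {"microfiche": 630, "ultrafiche": 750}  # dollars per subscriber


def implied_annual_cost(subscribers: int, subscription: int) -> int:
    """Total annual income if every subscriber pays the quoted subscription."""
    return subscribers * subscription


implied = {
    medium: implied_annual_cost(SUBSCRIBERS, price)
    for medium, price in ANNUAL_SUBSCRIPTION.items()
}
for medium, total in implied.items():
    print(f"{medium}: about ${total:,} per year")
```
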

RECOMMENDATIONS
The Library and Information Retrieval Centre 

1. Planning for the Library and Information Retrieval Centre should be
finalized as a matter of urgency. Speedy determination of the host computer
system to be employed by the National Library is a matter of the first
importance for National Library EDP system planning purposes. Guaranteed
access to hardware with suitable operating, communications and data base
management software is a prerequisite for the implementation of an
integrated on-line data base serving multiple applications including the
CANUC EDP system. Lack of such access would have severe negative implications
for National Library EDP system planning and implementation, would hinder
the integration of National Library EDP applications, and would critically
weaken the economic feasibility and practicality of a CANUC EDP system -
the centre-piece of National Library EDP systems to be implemented in the
foreseeable future. (p. 26)

2. Liaison between the future participants in the Centre should be strengthened
to assist planning, to ensure maximum compatibility and to help the sharing
of information and experience. Liaison concerning software development is
of particular importance. The various participants in the Centre may commit
themselves to different operating systems, communication monitor and data
base management software, hindering future operation on a common facility
and hindering future integration of files and systems. While a single set
of software may not meet all the needs of each participant, liaison should
assist agreement on at least common operating system and communication
monitor software and on preferred programming languages. (p. 26)

The Canadian national bibliographic EDP network

3. The National Library, in co-operation with other libraries across Canada,
should work out a plan for Canadian national bibliographic EDP network
development. This plan should be phased and should show for each phase
the extent of the network, including:

- details of participating organizations 

- location and nature of hardware employed 

- location and nature of files accessible to the network 

- communications facilities 

- services and facilities offered by each participating 
organization 

This recommendation accords closely with the views of the Canadian
Computer/Communications Task Force, exemplified in the following recommendations
of the report of the Task Force entitled Branching Out:

1. "Computer/communications (i.e. computer services by
remote-access through communications facilities) should
be recognized by governments as a key area of industrial
and social activity, and steps should be taken towards
strengthening . . . co-ordination of its development to
the benefit of Canadian society."

2. "In the formulation of national computer/communications
policy a unified approach throughout Canada should be
stressed as a key factor requiring close co-ordination
between federal and provincial actions."

3. "In the area of federal responsibilities a Focal Point
should be established within the government for co-ordination
in the development, formulation and continuing evaluation of
national policy in all matters pertaining to the field of
computer/communications."

Immediate tasks in the area of bibliographic EDP network planning
include:

- the strengthening of organizational arrangements and the provision
of additional resources to support joint national/provincial planning
concerning library and information service development.

- Canadian National Union Catalogue system planning, for instance the
manner in which libraries will report when the new CANUC EDP system
is operational. While many libraries may wish to continue reporting
exactly as they do now, some will wish to report in machine-readable
form. The mode and timescale of this machine-readable reporting needs
to be worked out in detail.

- Planning of catalogue support services, both off-line and on-line.
Joint national/provincial planning is required since some services
will be provided by the National Library, some will be provided by
inter-provincial and intra-provincial projects (for example
Ontario Universities' Library Co-operative System) and some may be
provided by foreign or international projects, for example Ohio
College Library Center.

- Active participation in international bibliographic EDP network planning
and development, for instance MARC, ISDS and UNISIST network planning
and development. (pp. 45-46)

National Library data base strategy 

4. Given adequate and economic host computer hardware availability the
National Library should plan to use an integrated on-line data base containing
CANUC, National Library acquisitions and in-process files, and selected
MARC records, in which the data of each bibliographic item is held only
once. This base can serve many different applications including CANUC
accession reporting, CANUC location service, CANUC publication, Canadiana,
National Library EDP cataloguing, National Library acquisitions, and
catalogue support services for Canadian libraries. All potentially
useful records and record data fields should be added to the initial
on-line file, and monitoring should be undertaken to identify low-use
records and data fields. Low-use records and record data fields should
be removed from the on-line file as soon as they have been identified
by the monitoring process. (p. 35)

This strategy accords with that adopted or planned by other major
data base projects, notably the national data base projects of France
and Sweden and the Ohio College Library Center family of projects in
the USA. As with these projects elsewhere, full development of the
Canadian national bibliographic data base and the services supported
by this base will be a major enterprise involving considerable
expenditures over the greater part of a decade. Special Canadian
requirements, such as those associated with bilingualism and network
development in a country of Canada's geographical size, will add to
the already complex problems related to the development of a large
national bibliographic data base and the services to be provided from
this base. The gains from undertaking this type of development may be
inferred from the way in which the OCLC system is being replicated in
different regions of the USA on a self-supporting basis and from the
national plans of France and Sweden, as well as the more direct National
Library, CANUC and other benefits which may be realized in Canada. (pp. 41-42)

National Library software development 

5. The National Library should specify in more detail the data base
structure required for its purposes and should carry the study of data
base management software packages to the benchmark testing stage. Data
base management software, such as CAN/OLE, presently used or being
developed by other Library and Information Retrieval Centre participants
should be studied before final selection of software for benchmark
testing. (pp. 26-27)

6. Following testing and selection of data base management software,
similar detailed analysis including benchmark tests should be performed
in respect of communications monitor packages. (p. 27)

7. Following these stages, and given adequate and economic host computer
system availability, the National Library should proceed to implement
an integrated data base employing defined structures and chosen data
base management and communications monitor software. (p. 27)

ABSTRACTS OF ANNEXES TO THE REPORT

A. Software and file organization (68p)

The purpose of this sub-study is to determine whether any existing data
base management ('DBMS') and communications monitor software packages
have facilities which satisfy the requirements of the National Library.
The general characteristics of these types of software are described
including desirable features. An outline of the National Library data
base is then presented including its structure, levels of bibliographic
record and desirable linkages. Linkages include 'vertical' (e.g. collection,
unit, analytic), 'horizontal' (e.g. copy and part), and 'parallel' (e.g.
supplements, continuations, reprints, translations). Access methods are
examined including compression code, word, and normalized entry access,
and access points needed for different files and levels of bibliographic
item in the National Library data base.

National Library requirements of DBMS are then considered in more detail
and the results of a preliminary literature survey of 13 packages are
reported. Two DBMS and associated communications monitor packages which
(on paper) satisfy National Library requirements are reported in more
detail. Finally, a series of conclusions and recommendations are drawn:
these are included in Chapter IV of the main report.



B. Model outline of Canadian National Union Catalogue EDP system (80p)

Following presentation of background facts, statistics and assumptions,
assumed CANUC EDP record contents and layout and access methods are presented.
Average record size and file size growth are computed employing stated
assumptions. Data base software assumptions are stated and terminal
numbers and unit costs for accession report input and for location
searching are computed. An illustrative computer hardware installation
is presented and it is noted that a number of commercial bureaux have
large enough installations to support - at least temporarily, say in
early testing stages - a CANUC EDP system (bureau economics are not
examined). Conclusions concerning the CANUC EDP system are then
presented.

The remainder of the annex deals with publication of CANUC catalogues
employing the EDP system. Assumed volumes, layouts and alternative
publication patterns are outlined, followed by a description of
publication media and Computer Output Microfilm ('COM') systems.
Suitable microfiche and ultrafiche readers are identified, and subscription
costs for assumed EDP system-produced catalogues are computed. Finally,
a possible Voice Answerback telephone location service is outlined.

C. Canadian on-line cataloguing service model (75p)

The purpose of this study is to throw light on the economics of national 
versus regional provision of on-line catalogue support services. 
Costings are based on Ohio College Library Center 1971/72 costs with 
networking costs provided by the Computer Communications Group of the 
Trans-Canada Telephone system and Xerox of Canada Limited hardware 
costing of an OCLC configuration. National versus regional costs are 
computed for two networks chosen to be approximately the same size as the 
OCLC network at the time of study work, and approximately twice this size. 
The main conclusions are reported in Chapter VI of the main report. 

D. Pattern of use in Canada of Canadian MARC,
foreign MARC and ISDS records (28p)

The purpose of this annex is to summarize known requirements of Canadian
libraries for MARC and ISDS records. The present extent of the Canadian
MARC record distribution service and its future prospects are described.
Known use and intended applications for Canadian MARC tapes are summarized,
including the responses to a poll of Canadian MARC pilot project recipients
concerning frequency and types of service required. Canadian MARC format
development plans and service plans are noted. Canadian MARC pilot project
participants' format preferences and choice of tapes and distribution
frequency are outlined.

Tables show Canadian MARC pilot project participants' known applications,
the Canadian MARC format development schedule, participants' hardware,
and an outline of the topics suggested for inclusion in the participants'
reports at the end of the pilot project.

E. Projected availability of MARC and ISDS records
of prime interest to Canada (30p)

Availability of these records is relevant to the overall study because
they can be used both for catalogue support purposes and to reduce data
input costs; if the records are added to the on-line data base their
number and storage requirements are of importance in data base design.

Based on information supplied by the national and international agencies
concerned, annual production of current records is projected for all ISDS
records and MARC records from the following countries: Australia, Canada,
France, Germany, United Kingdom, and the United States. Projections are
shown for each service both for the five calendar years commencing 1st
January 1974, and the five National Library fiscal years commencing
1st April 1974. Retrospective records projected to be available in the
period 1974-78 are detailed separately.

Based on stated assumptions the number of current MARC and ISDS titles
for the same group of services is calculated, giving one total figure
for ISDS and MARC services of France, UK and the USA, and one grand total
figure for the whole group. These two figures of current titles are
projected for the thirteen-year period to 31st December 1986, the period
expected to cover the first ten years of life of the CANUC EDP system.



F. Canadian National Union Catalogue: projected volumes
of accession reports, titles reported and location
requests 1974-78 (14p)



These projections are of fundamental importance to National Library data
base and CANUC EDP system design. National Library union catalogues are
briefly described. CANUC accession report statistics are projected to
31st March 1979 based on historical statistics commencing 1st April 1960.
Statistics are divided into the following groups: "non-serials", "serials
other than newspapers" and "total" - the latter being the combination of
the two former classes. Newspaper accession reports are not included
because the volume of current reports is very small. CANUC titles reported
are projected in the same groups over the same timescales. Location
requests are projected over the same timescales employing the same groups,
with the addition of newspaper location requests which are shown separately
and added into total figures. Lastly, a summary graph is presented of
historical and projected statistics for all material for the ten National
Library fiscal years to 1978/79.



G. Comparison of MARC monographic formats:
Canadian, US, UK and Intermarc (54p)

This annex presents a comparative table of content designators of the
monographic communication formats indicated in its title, and draws conclusions
based on this table and the operational experience of the British National
Bibliography Limited, London, England.

H. France and Sweden: plans for the development
of national bibliographic EDP networks (29p)

This annex records information obtained from visits to the countries
and from the literature, and draws overall conclusions which are quoted
in Chapter VI of the main report. The section concerned with France
sketches the organization of libraries in France, and describes both
the Bureau pour l'automatisation des bibliothèques and the remaining
three years of the sixth national five-year plan for library automation.
A bibliography of relevant literature is provided. The section concerned
with Sweden is similarly structured. Draft documents were circulated to
persons in France and Sweden and the present annex incorporates changes
suggested by recipients.

I. INTRODUCTION



Definitive planning of the organization, structure, and modes of access 
to be employed in a machine-readable data base requires precise knowledge 
of the following: 

1. Services and outputs to be provided from the data base, including
details of transaction volumes and response times over the life of
the base.

2. Inputs to the data base including not only crude volumes, but also
the degree of duplication of records and of fields within records.

3. Hardware to be employed.

It commonly takes about 3 years to bring a large EDP application to
operational status and the system installed commonly has a life of about
7 years. From this it is seen that data base planning has to look some
10 years into the future. Clearly there is a high degree of uncertainty
in such a long look ahead, compounded by the fact that the major
design-determining factors noted above represent at this time questions without
answers adequately defined for definitive data base design purposes.



In these circumstances the present study is essentially a modelling exercise 
employing available figures and information with order-of-magnitude 
projections to show the broad data base and system strategies which should 
be followed. This point is emphasized: the purpose of the report is to 
help determine strategy. Matters of detail are included insofar as they 
are pertinent to strategy: many matters of detail are not treated because 
they are not critical in system design.

II. THE CONCEPT OF A NATIONAL BIBLIOGRAPHIC DATA BASE



A. "BIBLIOGRAPHIC DATA BASE": A DEFINITION

"Bibliographic" is used in this report in the sense of "pertaining to
library materials". These materials include monographs, serials, theses,
maps, music and audio-visual materials. This list is not exhaustive and
in the medium or longer term it may be advantageous to include documentation,
library and archive material EDP records in a single data base system.

A "data base" for the purposes of this report is "a sizeable collection
of computer-readable data organized for use". The base may comprise a
number of files held on a variety of storage media. For example, some 
organizations with disk data bases and the need to publish large volumes 
of records maintain a separate formatted output file to save on publication 
processing costs. In this case, multiple copies of a record exist for a 
single item: a master disk record (not necessarily held as one integral 
record on disk), at least one copy of this record on tape for regeneration 
purposes, at least one formatted output record, and at least one copy of 
the formatted output record for regeneration purposes. 
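
The multiple-copy arrangement described above can be made concrete with a small data structure. The following is a minimal sketch in Python under stated assumptions: the four kinds of copy are taken from the paragraph above, while the class names, the `item_id` field and the sample identifier are hypothetical and serve only to illustrate the minimum set of copies held for one item.

```python
from dataclasses import dataclass
from enum import Enum


class CopyKind(Enum):
    """Kinds of copy named in the text; the labels are paraphrases."""
    MASTER_DISK = "master record on disk"
    MASTER_TAPE_COPY = "tape copy of the master record, for regeneration"
    FORMATTED_OUTPUT = "formatted output record"
    FORMATTED_OUTPUT_COPY = "copy of the formatted output record, for regeneration"


@dataclass(frozen=True)
class RecordCopy:
    item_id: str      # hypothetical identifier of the bibliographic item
    kind: CopyKind


def minimum_copies(item_id: str) -> list[RecordCopy]:
    """One copy of each kind: the minimum set implied by 'at least one' in the text."""
    return [RecordCopy(item_id, kind) for kind in CopyKind]


copies = minimum_copies("item-0001")
for copy in copies:
    print(copy.item_id, "-", copy.kind.value)
```

Under this reading a single item is represented at least four times across the base, which bears directly on storage estimates.
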



B. THE POTENTIAL NATIONAL BIBLIOGRAPHIC DATA BASE 

The Treasury Board EDP master plan projects the creation of a Library and
Information Retrieval Centre under the custodianship of the National Librarian.
Principal departments recommended to participate in the Centre are: National
Library; National Research Council (National Science Library); Public Archives;
National Museums; Information Canada; Defence Research Board; Agriculture.




The National Library and National Science Library are envisaged as
"the nucleus of a service-wide application centre for information
retrieval" taking into account the fact that these two libraries have
"given rather careful study to the establishment of a joint computing
facility which would meet, eventually, the needs of all government
libraries in the information retrieval area".

At the present time the departments which will participate in the Centre
maintain separate files, and it is not very meaningful to refer to the
totality of their machine-readable files as a single data base. Insofar
as their existing files of machine-readable bibliographic records are
large enough to merit the term "bibliographic data base", they exist as
separate data bases. However, when the files are processed on a single
installation it will be possible to integrate them into a single
organized system. When this has been done it will be meaningful - with
one qualification - to describe these machine-readable files relating
to library materials as "the national bibliographic data base".

The qualification relates to the medium or longer term in which an 
on-line national bibliographic network is envisaged. At that time it 
will be possible to integrate machine-readable files at remote locations 
into a single organized system. The national bibliographic data base 
will then comprise all the accessible bibliographic files at the nodes 
in the national network. 

From these observations it is clear that "the national bibliographic 
data base" does not at present exist in an integrally organized form. 
Further, when the integrally organized base comes into existence, its 
initial scope will be much more restricted than its likely eventual 
scope. The scope of the data base considered in this report is indicated 
in the next section. 
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C. SCOPE OF DATA BASE CONSIDERED IN THIS REPORT 

This study is concerned with the development of the National Library 
bibliographic data base over the next 5 to 10 years. This may be 
regarded as one of the necessary steps in data base planning for the 
Library and Information Retrieval Centre: a realistic approach is to 
consider the plans of each of the proposed participants in the Centre 
and use the resulting information to consider to what extent these plans 
can be integrated. 

The relationship of the National Library data base to the national 
bibliographic data base and its associated non-bibliographic files can 
be depicted diagrammatically. Figure 1 below depicts the situation in 
which the Library and Information Retrieval Centre has been set up and 
holds the national bibliographic data base. 



Figure 1 : Diagrammatic representation of the relationship between the 
National Library bibliographic data base and the national 
bibliographic data base 
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- bibliographic files relating to library, documentation and archive 
material 

- non-bibliographic files, for example mailing lists, patron files, 
vendor files 
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III. EDP-BASED SYSTEMS AND SERVICES PLANNED 



BY THE NATIONAL LIBRARY 



DESCRIPTION 

The function of the National Library data base is to support the EDP 
operations of the Library. Existing, projected and likely EDP systems 
and projects are noted below, grouped into system areas. This information⁴ 
is drawn from plans endorsed by the Library's Committee on Automation 
and by the Associate National Librarian in January/February 1974. These 
plans are, however, subject to Treasury Board approval and to the 
availability of the hardware, staff and other resources needed for their 
realization. The general philosophy behind these plans is encapsulated 
in the first recommendation of the Report of the Systems Development 
Project: 

"An integrated bibliographic information system for the National 
Library be developed comprising all the operations of acquisition, 
cataloguing, Canadiana, serials control, the Union Catalogue and 
union lists which permit handling by electronic data processing. 
The integrated system will operate in a combination of batch and 
real-time modes." 



1. Canadiana 

Plans in relation to Canadiana have been announced in National Library 
News, November-December 1972⁷ and January 1974⁸. The plans call for: 

(a) Continuation of the EDP-based cataloguing and publication system 
for monographs; extension of this system to cover firstly serials 
and government documents and secondly audio-visual materials. 
Products include the national bibliography Canadiana and a proof 
service to libraries for assistance in their technical processing. 

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(b) Use of the EDP-based Canadiana system to produce National 
Library catalogue cards for the library's Canadiana (and 
non-Canadiana) intake. 

(c) For Canadian serials: production of machine-readable records 
conforming to ISDS standards for submission to the International 
Serials Data Centre, Paris. 

(d) For government documents: provision of output from the National 
Library data base to a government documents indexing subsystem - 
almost certainly that developed at the University of Guelph. The 
possibility of producing the Information Canada Daily Checklist 
from Canadiana records is presently being studied by a task group. 

(e) Production and distribution of Canadian MARC records. 

(f) Retrospective conversion of all records for Canadian books published 
before 1949. First priority is given to all records for the 
bibliography 1867-1900, in order to publish a bibliography covering 
these years. 

(g) Eventually, following investigation, the Preserved Context Index 
System (PRECIS) may be implemented to provide a subject index to 
Canadiana and to provide additional subject access to MARC records. 

2. National Library cataloguing 

(a) National Library Canadiana cataloguing has been dealt with above 
under the heading of Canadiana. NL non-Canadiana cataloguing is 
to be progressively transferred from manual to EDP procedures 
within the next three years. The EDP procedures are those presently 
employed by the EDP-based Canadiana system. 
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(b) Participation in Ontario Universities' Library Co-operative 
System (OULCS) has been arranged for experimental purposes. It 
is probable that the system will be used to catalogue a quantity 
of foreign books. 

3. SDI 

SDI services are currently provided in co-operation with the National 
Science Library against the following magnetic tape services: 

US MARC      all subjects 
ERICTAPES    education 
SSCI         Social Sciences Citation Index 

It is proposed to search additional tapes: 

Canadian MARC 
Psychological Abstracts 
Historical Abstracts 

Further tapes are under consideration: PREDICAST (marketing information); 
Sociological Abstracts; UK MARC tapes; French MARC tapes; PASCAL (Centre 
National de la recherche scientifique - Bulletin signalétique). 

Concerning SDI, the following recommendations of the Federal Government 
Library Survey are relevant: 

(a) "Priority be given to the development of CAN/SDI services in the 
social sciences and humanities to parallel services provided in 
science and technology"⁹. 

(b) "CAN/SDI be expanded to include all available tape services, 
social sciences and humanities as well as scientific and technical, 
and that the service and its training programs be made available 
to all federal libraries through the Library and Information 
Retrieval Centre". 
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4. Canadian National Union Catalogue (CANUC) 

Policy with respect to this catalogue and a "New Automated Union 
Catalogue of Books" was announced in National Library News, January 
1974⁸. The National Library is to start a new EDP-based union 
catalogue system and will close the existing Union Catalogue of Books 
(UCB) as soon as the new system is operational. Following editing, 
the UCB will be published, probably in microform. The new system will 
provide the facility of publishing the catalogue at intervals. 
Access to the new EDP catalogue within the National Library will 
probably be on-line. 

At a later date the CANUC EDP system may be developed to provide 
statistical analyses of the items reported to CANUC; in particular, 
the resources survey of university library collections in Canada could 
be kept up to date in this way. 

5. Union Lists 

(a) Serials 

As noted in National Library News, January 1974, "It is not possible 
to make any decisions in this area at present because a great deal 
of work is in progress". If the interim recommendations of the 
Canadian Union Catalogue Task Group are accepted, the National 
Library will "proceed quickly with the development of a union list 
of periodicals in the humanities and social sciences as a first 
priority, and of newspapers as a second priority"¹². The latter 
recommendation accords closely with the recommendation of the Federal 
Government Library Survey that "The National Library give priority 
to the compilation of the computerised union list of serials in the 
social sciences and humanities"¹³. 
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If the recommendations of a recent report are accepted, the 
Canadian Union Serials List (CUSL) will be "produced in book 
form at least every two or three years in toto, plus an accumulated 
supplement at least every six months". Also the National Library 
and National Science Library will "formally organize to jointly 
produce and maintain a single Canadian Union File of Serials (CUFS) 
covering all subjects" and these libraries will "participate in 
the proposed CONSER* Project". Work in co-operation with the 
National Science Library is proceeding in the serials area. 

(b) Other material 

Following the production of the Canadian Union Serials List, 
further union lists are envisaged, but these have not yet been 
accorded the status of definite EDP-based projects in the current 
development schedule: 

Union catalogue of audio-visual materials 
Union catalogue of materials for the blind 
Union list of newspapers 


6. Integrated system support 

A machine-readable authority file for government headings will form 
part of the Canadiana system for serials and government documents to 
be operational in November 1974. It is planned to extend this file 
to all types of heading including personal, corporate, conference, and 
series headings. The authority file will include headings in Canadiana, 
National Library cataloguing and, in course of time, CANUC and National 
Library acquisition and serials control files. If a recommendation of 
the Federal Government Library Survey is accepted, "the authority file 
for personal and corporate authors" will be published and frequently 
updated. 
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* Co-operative project for the CONversion of SERials 
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Creation of a machine-readable authority file is tentatively scheduled for 
1975/76. Also pertinent to integrated system support is a proposal to 
investigate data base management and communications monitor software. 



7. MARC and catalogue support services 

The National Librarian has announced (January 1974)¹⁸: 

"Next month, the National Library will begin to produce Canadian 
MARC tapes, and the library intends to exchange these with other 
countries in return for their tapes, as well as to purchase other 
tapes from various countries and will be able to provide a 
machine-readable service in a batch mode, providing selected records on 
magnetic tape at cost to interested libraries and bibliographic 
centres. Libraries could subscribe to all the records from a 
country, to records selected by categories, or to specific records 
selected by numbers such as the LC Card Number or the International 
Standard Book or Serial Number. This will form an off-line 
cataloguing support service which will provide subscribers with 
machine-readable records for use in their own systems. 

"First priority is to be given by the National Library to implementing 
the new automated system for the union catalogue and to developing 
the bibliographic services using MARC tapes. The second priority is 
to develop for federal government libraries a cataloguing service 
which may be in the form of a centralised, cooperative or shared 
cataloguing service with the possibility of the system being extended 
to other libraries in the country which wish to participate. Next 
year, the National Library will investigate the best means of 
providing cataloguing support to the federal government libraries." 
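
The three subscription modes named in the announcement (all records from a country, records selected by category, specific records selected by number) can be pictured as a batch filter over a tape of records. The sketch below is only an illustration under stated assumptions, not the National Library's design: the `MarcRecord` fields, the `Subscription` class and the `select_records` function are hypothetical names, and a real MARC record carries far more structure than is shown here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class MarcRecord:
    """Hypothetical, much-simplified stand-in for a MARC record."""
    country: str                        # issuing national service, e.g. "CA"
    category: str                       # illustrative category label
    lc_card_number: Optional[str] = None
    isbn: Optional[str] = None
    title: str = ""


@dataclass(frozen=True)
class Subscription:
    """What one subscribing library has asked to receive (assumed model)."""
    countries: FrozenSet[str] = frozenset()
    categories: FrozenSet[str] = frozenset()
    numbers: FrozenSet[str] = frozenset()   # LC card numbers or standard numbers

    def matches(self, record: MarcRecord) -> bool:
        # A record is selected if it satisfies any one of the three modes.
        if record.country in self.countries:
            return True
        if record.category in self.categories:
            return True
        record_numbers = {n for n in (record.lc_card_number, record.isbn) if n}
        return bool(record_numbers & self.numbers)


def select_records(tape: Iterable[MarcRecord], sub: Subscription) -> List[MarcRecord]:
    """One batch pass over a tape, keeping the records a subscriber wants."""
    return [record for record in tape if sub.matches(record)]
```

Because selection is a single sequential pass, it suits the off-line, tape-based batch mode described in the announcement; an on-line service would need indexed access instead.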
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8. Acquisitions 

(a) All materials 

In due course the National Library will develop an EDP-based 
subsystem supporting selection, ordering and claiming for all 
materials. The subsystem will interface with cataloguing and 
catalogue support subsystems: bibliographic information captured 
in the acquisitions process will be available for use in the 
cataloguing process. Conversely, machine-readable data from 
CANUC and MARC files will be used in the acquisitions process. 
A centralized acquisitions system for all Federal Government 
libraries is a possible later development. 

(b) Serials control 

Serials control automation will be concerned with the creation and 
maintenance of a serials control file against which incoming issues 
may be checked and which will produce various lists of serials 
holdings as well as immediate information on the receipt of a serial 
issue by the National Library. Claiming, binding, renewal and 
routing functions may be included in this system, which may 
eventually be extended to Federal Government libraries. 
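
The check-in and claiming functions described for the serials control file can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a planned design: the class and method names are hypothetical, each serial is assumed to have a known list of expected issues, and an issue is treated as claimable once a later issue of the same serial has arrived.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SerialEntry:
    title: str
    expected: List[str]                       # issue labels in publication order
    received: Set[str] = field(default_factory=set)


class SerialsControlFile:
    """Hypothetical in-memory serials control file."""

    def __init__(self) -> None:
        self._entries: Dict[str, SerialEntry] = {}

    def add_serial(self, key: str, title: str, expected: List[str]) -> None:
        self._entries[key] = SerialEntry(title, list(expected))

    def check_in(self, key: str, issue: str) -> bool:
        """Record an incoming issue; False if unknown or already received."""
        entry = self._entries.get(key)
        if entry is None or issue not in entry.expected or issue in entry.received:
            return False
        entry.received.add(issue)
        return True

    def issues_to_claim(self, key: str) -> List[str]:
        """Expected issues that a later received issue has overtaken."""
        entry = self._entries[key]
        positions = [i for i, issue in enumerate(entry.expected)
                     if issue in entry.received]
        if not positions:
            return []
        latest = max(positions)
        return [issue for issue in entry.expected[:latest]
                if issue not in entry.received]

    def holdings(self) -> Dict[str, List[str]]:
        """Holdings list: title -> received issues in publication order."""
        return {e.title: [i for i in e.expected if i in e.received]
                for e in self._entries.values()}
```

The same file answers both questions the text raises: `check_in` gives immediate information on receipt of an issue, and `holdings` produces the holdings list.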



9. Multilingual Biblioservice 

The automation requirements of the Biblioservice are not yet fully defined. 
When operational, the Biblioservice will rotate book collections in other than 
the official languages among public and regional libraries; these will, 
in turn, make them available to interested people in the areas they 
serve. Following more detailed investigation, an EDP system may be 
devised to support this operation. 
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10. Directories and mailing lists 

It is planned to study the feasibility of employing EDP means to 
create, maintain and publish various Canadian library directories 
presently prepared or being considered by the Library Documentation 
Centre. The various mailing lists required by the National Library 
might be produced from the file used to produce the directories. 



11. Canadian Book Exchange Centre 

It is planned to investigate the feasibility of an EDP system to 
create, update and distribute lists of surplus materials. 



IMPLEMENTATION SCHEDULE 

Figure 2 below shows the timescale of the implementation of the systems 
and projects outlined above. It has the following legend: 

- Commencement of new or enhanced system or service 

- Commencement as noted above is dependent on or closely linked with 
some other system or service 

- Service or system continues beyond timescale of chart 
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IV. HARDWARE AND SOFTWARE 



A. HARDWARE 

Although the National Library is committed to participate in the Library 
and Information Retrieval Centre³, the plans have not reached the stage 
at which the future hardware of the Centre, and the timescale of the 
availability of this hardware, are determined. 

Hardware availability is a major factor in the design of any system with 
a sizeable on-line data base used many hours a day. If the system has to 
be operated for a significant proportion of its life in a bureau environment 
at bureau prices, this is likely to have a considerable impact on systems, 
procedural and data base strategies. For example, it may not be economic 
to maintain a very large data base on-line for all normal working day hours 
in a bureau environment. In this case it may be necessary to carry out 
more processing off-line. 

It is therefore of the first importance to National Library system planning to 
establish the likely future hardware position. This implies that an extremely high 
priority should now be accorded to the planning and implementation of the 
Library and Information Retrieval Centre. Important steps will be to: 

1. Set up a study with appropriate terms of reference, and controlling 
and liaison arrangements 

2. Study the requirements of the participants in the Centre 

3. Report with recommendations 

4. Consider the report and recommendations 

5. Decide policy and implementation plan 

6. Proceed with implementation 
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These steps will not be accomplished overnight, and the future participants 
in the Centre will not be able to hold back their system development to 
await decisions concerning the Centre. In this situation there will be 
considerable advantages in strengthening liaison and co-ordination between 
the participants to ensure that their developments are, and remain, as 
compatible as possible. The software aspects of this argument are developed 
in more detail in later sections concerned with software. Present hardware 
compatibility is generally indicated in the two tables below. 

Table 1 : Main computers presently used by the chief future participants 
in the Library and Information Retrieval Centre 

Chief participants as listed    Computers (a)       Major Operating Systems 
in EDP Master Plan 
----------------------------------------------------------------------------- 
National Library                IBM 360/40          OS/MVT 
                                IBM 360/50 

National Research Council       IBM 360/67          TSS 
(National Science Library) 

Public Archives                 IBM 360/50          DOS, transferring to OS 
                                IBM 360/40 

National Museums                IBM 370/135         OS/VS 
                                IBM 360/85          OS 
                                IBM 360/50 
                                PDP 11/05 

Information Canada              IBM 360/65          OS 
                                RCA Spectra 70/45 

Defence Research Board          Varian 620L 
                                IBM 370/145         OS 
                                IBM 370/165 

Agriculture                     IBM 360/50          DOS, transferring to OS 
                                IBM 360/40 
                                IBM 370/165         DOS and OS 
                                Univac 1108         Exec 8 

Source: Canadian Library Directory, 1974, and direct contact of the organizations. 
Notes: (a) Most of the computers shown belong to service bureaux. 
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Federal Government libraries would also be prospective users of the 
Centre. Table 2 below shows the present position with regard to these 
libraries. 

Table 2 : Computers used in library automation in federal government 
libraries 

Computer 
------------------------------------ 
IBM 1401                  1       1 
    360/30                1       1 
    360/40                2       6 
    360/50                2       2 
    360/65                3       8 
    360/67                1       5 
CDC 6600                  2       1 
Univac                    1       1 
Ferranti/Packard          1       1 
CDC 3100                  1       1 
------------------------------------ 
Totals                   15      31 

Source: Federal Government Library Survey; report of the Services 
and Systems Team, Ottawa, National Library, 1973, p. 267 
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B. SOFTWARE 



As noted in Annex A, four categories of software are required to maintain 
an integrated on-line data base and to provide terminal access to this base: 

1. User application programs 

These provide for the detailed processing of records. Examples of 
this processing are validation of input records and formatting of 
records for printing. Utility programs for sorting, merging and 
dumping are included in this category. 

2. Data base management 

This software maintains the files and their indexes, and provides user 
application program and communications monitor access to these. 

3. Communications monitor 

This software "acts as a three-way interface between messages from 
teleprocessing terminals, the application programs which determine 
how the messages are to be processed, and the operating system. It 
controls the queuing, editing, and dispatching of the messages, the 
centralised input/output operations involved, and the rolling in and 
out of programs"¹⁹. 

4. Computer operating system 

This provides overall control of operations, including allocation of 
core storage and peripheral devices to programs. 
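
The division of labour between the first three categories can be illustrated with a toy model. It is a sketch under stated assumptions only: all class, function and transaction names are hypothetical, the data base is an in-memory dictionary, and the fourth category (the operating system) is represented simply by the Python runtime on which the sketch executes.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple


class DataBaseManager:
    """Category 2: maintains the files and gives programs access to them."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def store(self, key: str, record: str) -> None:
        self._records[key] = record

    def fetch(self, key: str) -> Optional[str]:
        return self._records.get(key)


def lookup_application(dbms: DataBaseManager, text: str) -> str:
    """Category 1: a user application program that formats one record."""
    record = dbms.fetch(text.strip())
    return f"FOUND: {record}" if record is not None else "NOT FOUND"


Application = Callable[[DataBaseManager, str], str]


class CommunicationsMonitor:
    """Category 3: queues terminal messages and dispatches them to programs."""

    def __init__(self, dbms: DataBaseManager) -> None:
        self._dbms = dbms
        self._queue: Deque[Tuple[str, str]] = deque()
        self._applications: Dict[str, Application] = {}

    def register(self, code: str, application: Application) -> None:
        self._applications[code] = application

    def receive(self, terminal: str, message: str) -> None:
        self._queue.append((terminal, message))       # queuing

    def dispatch_all(self) -> List[Tuple[str, str]]:
        replies = []
        while self._queue:
            terminal, message = self._queue.popleft()
            code, _, text = message.partition(" ")     # editing
            application = self._applications.get(code)
            reply = (application(self._dbms, text) if application
                     else "UNKNOWN TRANSACTION")       # dispatching
            replies.append((terminal, reply))
        return replies
```

The point of the layering is that the application program never touches the terminal or the stored files directly, which is what allows a proprietary monitor or data base package to be substituted without rewriting the applications.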

Figure 3 below illustrates the interrelationships of these categories of 
software in a data base/data communications system. 
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Figure 3 : Components of a data base/data communications system 
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C. SOFTWARE OPTIONS 

In the context of the Library and Information Retrieval Centre, the National 
Library will wish to select an operating system and a communications monitor 
likely to be most acceptable and useful to the Centre's participants. It is 
hardly conceivable that this choice could involve modification of a 
manufacturer's operating system or writing communications monitor software. 
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The position with regard to data base management software is also clear- 
cut, provided that an existing package can be found which meets the Library's 
requirements within acceptable performance, cost and other parameters. 
The economic and elapsed time benefits of making use of an existing data 
base management system (DBMS) are indicated in the following quotation from 
Annex A: 

"The University of Chicago²⁰ estimated that the development of 
a sophisticated access method (the basis of a DBMS) would 
require 86 man months, and would cost $151,000. Note that an 
access method is not a full DBMS, and that a satisfactory 
product was not guaranteed. MRI Systems Corporation²¹ quotes 
studies made by a number of management consulting firms which 
price in-house development of a sophisticated DBMS, such as 
System 2000, at $1.5 million to $3 million. Purchase of a 
proprietary DBMS package ... is obviously less expensive than 
in-house development." 

Factors in the selection of data base management software packages are 
discussed below. 



D. FACTORS IN THE SELECTION OF A DATA BASE 
MANAGEMENT PACKAGE 

The list of factors below summarizes the salient factors applicable in 
the case of the National Library; it is drawn from a fuller analysis contained 
in Annex A: 

1. Hardware and operating system 

The package must operate on the hardware available under an acceptable 
operating system. 
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2. Facilities 

The package must provide the basic facilities required. In the present 
case it must be capable of maintaining files of sizes of a few hundred 
up to 10 million variable length records with a varying number of 
variable length fields. Access via multiple record keys must be possible 
(e.g. LC number, ISBN, author and title compression codes) with provision 
of pre-search statistics. It must be able to store portions of the data 
base on sequential storage devices, such as magnetic tape. Facilities 
must be available for linking records and parts of records as required. 
Linking and association requirements are analysed in some detail in 
Annex A; they are rather more complex than in the average data base. 
In addition, facilities for regenerating data in the event of hardware 
or system failure are mandatory, and there should also be facilities 
to prevent access by unauthorized persons to classified data. 



3. Dependability and transferability 

The package must be operational and proven on as many installations as 
possible. It should be maintained and updated by a reputable and reliable 
software development organization. Updating should also ensure that 
package users can make use of new versions of the same operating system 
and new models in the same hardware range. 



4. Flexibility 

The package should allow reorganization of the data base, creation and 
deletion of files, and amendment of record structures without the necessity 
for package reprogramming. 
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5. Performance and cost 

The package should not be unduly costly to rent (or buy) and implement. 
Once implemented it should operate within acceptable core storage limits 
and should maintain indexes and files within acceptable storage limits 
and costs. In operating it should provide timely response to user 
requests without excessive "housekeeping" costs. Housekeeping in this 
context refers to operations such as file reorganisation, index 
maintenance and updating, generation of links between records and parts 
of records. There is in practice often a trade-off between response 
time and housekeeping cost: faster response time may be achieved at the 
expense of increasing housekeeping overhead. Alternatively, housekeeping 
costs may be lowered at the expense of accepting a slower response time. 
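
Two of the facilities listed under factor 2, access via multiple record keys and pre-search statistics, can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not a description of any package surveyed: the class and key names are hypothetical, the compression codes are supplied by the caller rather than computed, and a real package would hold its indexes on disk.

```python
from collections import defaultdict
from typing import Dict, List


class BibliographicFile:
    """Hypothetical file of variable-length records with several key indexes."""

    def __init__(self) -> None:
        self._records: Dict[int, Dict[str, str]] = {}
        # key type -> key value -> set of record identifiers
        self._index = defaultdict(lambda: defaultdict(set))
        self._next_id = 1

    def add(self, fields: Dict[str, str], keys: Dict[str, str]) -> int:
        record_id = self._next_id
        self._next_id += 1
        # Only non-empty fields are stored, so empty fields take no space.
        self._records[record_id] = {k: v for k, v in fields.items() if v}
        # Index maintenance at update time is the "housekeeping" cost
        # that buys a fast response at search time.
        for key_type, value in keys.items():
            self._index[key_type][value].add(record_id)
        return record_id

    def count(self, key_type: str, value: str) -> int:
        """Pre-search statistic: number of hits, without retrieving records."""
        return len(self._index.get(key_type, {}).get(value, ()))

    def search(self, key_type: str, value: str) -> List[Dict[str, str]]:
        ids = sorted(self._index.get(key_type, {}).get(value, ()))
        return [self._records[i] for i in ids]
```

A user would call `count` first to learn how many records a key would retrieve, and only then decide whether to run `search` or narrow the request. The sketch also shows the trade-off noted under factor 5: every additional index speeds retrieval but adds work to each update.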



E. DATA BASE MANAGEMENT PACKAGES 

The list of factors just presented shows that package selection is not a 
simple task: it involves weighing many factors. Only limited conclusions 
can be drawn on the basis of published literature and specifications. All 
that can be decided on the basis of these is whether there appear to be 
suitable and acceptable packages and which packages should be tested out 
in practice. Empirical test is mandatory in order to establish factors 
such as response time and storage requirements in a practical context. 

The two tables below are drawn from Annex A; they show the range of packages 
given preliminary study, and summarize the findings. Within the limited 
time available it was not possible to study every package in existence; 
omission of a given package from the table does not mean that this package 
has been studied and definitely rejected. The objective was to carry out a 
preliminary survey of the main available and eligible packages. 
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CAN/OLE could not be studied in detail at the time of this preliminary 
work since it was in the process of development at the National Science 
Library, and documentation concerning its use and capabilities was not 
available; it should definitely be included in the next round of studies. 

ISIS should also be considered, especially if it is selected by National 
Museums. ISIS is presently used by the Swedish LIBRIS project and was 
developed at the International Labour Office, Geneva. It is presently 
supported by the International Development Research Centre in Ottawa 
and is being developed to enable it to handle large data bases. 

STAIRS is an information retrieval package rather than a comprehensive 
... more detail, especially if selected by National Museums. 



F. CONCLUSIONS OF SOFTWARE STUDIES 

The conclusions of work done in the software area are as follows: 

1. Data base management systems and communications monitors do exist 
which would substantially aid the National Library in the establishment 
of an integrated bibliographic information system. 

2. Purchase or rental of a proprietary data base management software 
package could save the National Library both considerable development 
time and expense. 
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3. Software package selection is critically dependent upon the choice 
of host computer system: most packages will run on IBM 360/370 machines 
under OS/MVT or OS/VS. There are packages for other machines, but 
the choice is very much more limited. A first or early step must be 
to establish the choice of host computer system. Since the host system 
will presumably be that of the Library and Information Retrieval Centre, 
this conclusion points to the need for speedy finalization of the plans 
of the Centre. 

4. Even if planning of the Library and Information Retrieval Centre is for 
some reason delayed, it is to the interest of all future participants 
to maintain close liaison. Principal aims in this liaison would be: 

(a) establish the present software and hardware commitments of the 
future participants; without this knowledge effective planning 
and liaison is difficult 

(b) assist planning of the Centre 

(c) ensure that new hardware, software, program language commitments 
are as compatible as possible 

(d) exchange information and experience. 

With regard to (c) and (d) above it is pertinent to note that the 
National Museums have currently reached the benchmark testing stage 
for certain data base management packages. The possibility exists 
that the National Library, National Science Library and National 
Museums could ultimately use at least the same operating system and 
communications monitor and possibly also the same data base 
management system. 
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Table 3 : Limitations of existing data base management software packages 



DBMS          Requirements*     Comments 
              not satisfied 
--------------------------------------------------------------------------- 
ADABAS        11                Meets most requirements 

CAN/OLE       1, 2              Will be studied 

DBOMP         1, 2              Only runs on IBM 360/370 under DOS. Literature 
                                not studied; additional limitations may be 
                                present 

Disk Forte    1                 Burroughs software product. Literature not 
                                studied; additional limitations may be 
                                present 

DMS-1100      1, 3, 4, 9, 10    Univac software product. Still in development 
                                stage. Literature not studied; additional 
                                limitations may be present 

EDMS          1, 4, 11          XDS software product 

GIS           1, 6, 7, 9, 10    IBM software product. Restart and recovery 
                                not provided. Must run with IMS to obtain 
                                hierarchical and network data structures 

IMS           1, 7, 9, 10, 11   IBM software product. Excessive memory 
                                requirements. Lacks data independence - 
                                change in record format means changes to 
                                programs accessing the data. No report 
                                generation facility 

MARS-III      1                 CDC software product. Literature not studied; 
                                additional limitations may be present 

METABASE      1, 11             IBM 360/370 under OS or VS only. Meets most 
                                requirements 

SPIRES        1, 3              IBM 360/370 only. Not a proprietary package; 
                                software specialists would be required for 
                                maintenance. No report generator facility. 
                                Literature not studied; additional limitations 
                                may be present 

System 2000   3, 9, 10          Programs must be precompiled to convert calls. 
                                Not callable from PL/I 

Total         4, 7, 9, 10, 11   Threaded list approach. Utilities to 
                                reconstruct data base from log tapes not 
                                supplied 

* Meaning of numbers given in Table 4 
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Table 4 : Data base management software requirements 



Number    Meaning 
------------------------------------------------------- 
1         Operate on a variety of computer makes 
2         If IBM, then under OS MVT or OS VS 
3         Calls from high level languages 
4         Presearch statistics, multi-key search 
          and retrieval 
5         On-line updating 
6         Variable length records, variable number 
          of variable length fields 
7         Empty fields do not require space 
8         Growth of data base is open ended 
9         New files added to data base without 
          reloading 
10        New logical relationships and record 
          structures without reloading 
11        Index all types of storage devices 
          (including sequential) 
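
Tables 3 and 4 lend themselves to a simple machine-readable form from which a shortlist for benchmark testing can be derived. The sketch below is illustrative only. The package names and requirement numbers are transcribed from Table 3 as read from the printed copy; the `literature_studied` flag is an assumption derived from the Comments column (it is false where the comment says the literature was not studied, or that the package will be studied); and the `shortlist` threshold is a hypothetical selection rule, not one proposed in this report.

```python
from typing import Dict, List, NamedTuple, Tuple

# Table 4: requirement number -> meaning
REQUIREMENTS: Dict[int, str] = {
    1: "Operate on a variety of computer makes",
    2: "If IBM, then under OS MVT or OS VS",
    3: "Calls from high level languages",
    4: "Presearch statistics, multi-key search and retrieval",
    5: "On-line updating",
    6: "Variable length records, variable number of variable length fields",
    7: "Empty fields do not require space",
    8: "Growth of data base is open ended",
    9: "New files added to data base without reloading",
    10: "New logical relationships and record structures without reloading",
    11: "Index all types of storage devices (including sequential)",
}


class Package(NamedTuple):
    unmet: Tuple[int, ...]        # requirements not satisfied (Table 3)
    literature_studied: bool      # assumption drawn from the Comments column


PACKAGES: Dict[str, Package] = {
    "ADABAS": Package((11,), True),
    "CAN/OLE": Package((1, 2), False),
    "DBOMP": Package((1, 2), False),
    "Disk Forte": Package((1,), False),
    "DMS-1100": Package((1, 3, 4, 9, 10), False),
    "EDMS": Package((1, 4, 11), True),
    "GIS": Package((1, 6, 7, 9, 10), True),
    "IMS": Package((1, 7, 9, 10, 11), True),
    "MARS-III": Package((1,), False),
    "METABASE": Package((1, 11), True),
    "SPIRES": Package((1, 3), False),
    "System 2000": Package((3, 9, 10), True),
    "Total": Package((4, 7, 9, 10, 11), True),
}


def failing(requirement: int) -> List[str]:
    """Packages recorded in Table 3 as not satisfying one requirement."""
    return sorted(name for name, p in PACKAGES.items() if requirement in p.unmet)


def shortlist(max_unmet: int) -> List[str]:
    """Studied packages with at most `max_unmet` unsatisfied requirements."""
    return sorted(name for name, p in PACKAGES.items()
                  if p.literature_studied and len(p.unmet) <= max_unmet)
```

With a threshold of two unmet requirements, `shortlist(2)` returns ADABAS and METABASE, the two packages whose comments read "Meets most requirements". A count of unmet requirements is a crude measure: the requirements are not equally important, and packages whose literature was not studied may have further limitations.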
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G. RECOMMENDATIONS CONCERNING HARDWARE AND SOFTWARE 

1. Planning for the Library and Information Retrieval Centre should be 
finalized as a matter of urgency. Speedy determination of the host 
computer system to be employed by the National Library is a matter of 
the first importance for National Library EDP system planning purposes. 
Guaranteed access to hardware with suitable operating, communications 
and data base management software is a prerequisite for the implementation 
of an integrated on-line data base serving multiple applications including 
the CANUC EDP system. Lack of such access would have severe negative 
implications for National Library EDP system planning and implementation, 
would hinder the integration of National Library EDP applications and 
would critically weaken* the economic feasibility and practicality of 
a CANUC EDP system - the centre-piece of National Library EDP systems 
to be implemented in the foreseeable future. 

2. Liaison between the future participants in the Centre should be strengthened 
to assist planning, to ensure maximum compatibility and to help the sharing 
of information and experience. Liaison concerning software development is 
of particular importance. The various participants in the Centre may 
commit themselves to different operating systems, communications monitor 
and data base management software, hindering future operation on a common 
facility and hindering future integration of files and systems. While a 
single set of software may not meet all the needs of each participant, 
liaison should assist agreement on at least common operating system and 
communication monitor software and on preferred programming languages. 

3. The National Library should specify in more detail the data base structure
required for its purposes and should carry the study of data base management
software packages to the benchmark testing stage. Data base management
software, such as CAN/OLE, presently used or being developed by other
Library and Information Retrieval Centre participants should be studied
before final selection of software for benchmark testing.

* Chapter V shows that the CANUC EDP system will have to process a higher
volume of records than all the other planned National Library EDP applications
put together. It also shows that on-line operations are required for efficient
CANUC EDP system performance.

4. Following testing and selection of data base management software, similar
detailed analysis including benchmark tests should be performed in respect 
of communications monitor packages. 

5. Following these stages - and given adequate and economic host computer 
system availability - the National Library should proceed to implement
an integrated data base employing the defined structures and chosen data 
base management and communications monitor software. 






V. THE NATIONAL LIBRARY BIBLIOGRAPHIC DATA BASE: BASIC STRATEGY



The principal factors in the design of the National Library's bibliographic
data base are:

1. The requirements of the systems the data base
   is to support

2. Resources available - in particular, the hardware
   to be used; software packages available; resources
   available for software development; development
   timescale.

Chapter III dealt with timescales; chapter IV dealt with hardware and
software; the present chapter outlines the salient requirements of the
systems the data base is to support.

THE MAIN CURRENT BIBLIOGRAPHIC RECORD INPUTS

Table 5 below shows the main inputs for the years 1974-78, assuming
implementation of systems according to the schedule shown in Figure 2.



Table 5

Projected major current bibliographic record inputs to the
National Library data base 1974-78

                                      1,000 machine readable records
Subsystem/service                     1974    1975    1976    1977    1978

Canadiana (a)                           25     [?]     [?]     [?]     [?]
NL non-Canadiana cataloguing             9      16      23      30      35
NL Acquisition                           -       -       -       -      50
Current US, UK, French, German,
  Australian MARC records              221     264     301     341     381
ISDS                                 52[?]      11      16      22      26
CANUC - accession reports                -       -       -   1,655   1,705
CANUC - location requests                -       -       -       9      13
CANUC - titles                           -       -       -   21[?]     216

[?] marks a figure that is illegible or doubtful in the source copy.

Notes:

CAN/SDI and CAN/OLE applications are omitted. At present CAN/OLE does not
cover the social sciences and humanities. CAN/SDI services related to these
subject fields (e.g. ERICTAPES service) do not entail record input to a
retrospective base. If CAN/OLE retrospective search services were commenced
for records such as US MARC, very serious consideration would have to be given
to the means of consolidating CAN/OLE and National Library data base files of
these records.

Authority records are not included in the table as it is assumed that they
are made up of edited portions of other records shown.

Source: Annex E, p. 4.

Source: Appendix 2, p. 2.

Subsystem assumed to become operational in 1978. Bibliographic input assumed
to equate roughly with NL non-Canadiana cataloguing 1979, plus acquisition
of foreign Canadiana.

Source: Annex F, Appendix 1. Serial and non-serial reports have been
aggregated. Serials are projected to account for only 20,000 accession
reports per year in 1977 and 1978.

Source: Annex F, Appendix 3. Following Table 13 of the Canadian union catalogue
location requests survey, only 6% of non-serial location requests are assumed
likely to relate to material in the EDP system in its first year of operation,
and 12% in its second year. Publication of supplements will lower these figures
in practice, and 3% and 6% have been used respectively in the table entries for
1977 and 1978. Following union list publication, serial requests directed to the
National Library are assumed to fall to 6,000 per year.

(e) Source: Annex F, Appendix 2.

The important conclusions to be drawn from Table 5 are:

1. The CANUC EDP system

This is the key to any integrated EDP system design for the National
Library, since it will have to process a larger volume of records than
all the remainder of the National Library's EDP applications put together
when it becomes operational in 1977.

2. Data input

The critical feature in the design of the CANUC EDP system is the means
by which the input of some 1.7 million accession reports per year is to
be achieved. Since an average of 8 locations per title is projected,
it follows that after an initial period any accession report will have
on average a seven-eighths (88%) chance that its bibliographic data has
already been input to the EDP system. The chance will be higher than
this if other machine readable records without locations are also on
file - for instance, MARC records. In this situation an on-line input
system facilitating use of existing machine-readable data will be much
more efficient than a batch input system. This argument is crucial to
data base design, and is therefore examined in greater detail below.
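
The seven-eighths figure follows from simple counting: if a title is reported by 8 locations, only the first report finds no record on file. The following is a minimal sketch of that arithmetic, assuming every title receives exactly the average number of reports; the function name is illustrative and is not part of the study.

```python
from fractions import Fraction


def chance_already_on_file(locations_per_title: int) -> Fraction:
    """Share of accession reports whose title has been reported before.

    Assumption: each title receives exactly `locations_per_title` reports
    and only the first of them has to be keyed in full.
    """
    if locations_per_title < 1:
        raise ValueError("a title needs at least one location")
    return 1 - Fraction(1, locations_per_title)


chance = chance_already_on_file(8)

# Accession reports projected for 1978 in Table 5.
reports_per_year = 1_705_000
full_keying = reports_per_year * (1 - chance)

print(f"chance of an existing record: {chance} ({float(chance):.1%})")
print(f"reports needing full keying:  {round(full_keying):,}")
```

The exact value is 87.5%, which the text rounds to 88%. Under the same assumption roughly 213,000 of the 1,705,000 reports would need to be keyed in full; the rest could be input by modifying a record already on file.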



B. ECONOMICS OF ON-LINE INPUT AND STORAGE IN A 
HYPOTHETICAL CANUC EDP SYSTEM 

Table 6 below analyses the economics of a hypothetical CANUC EDP system
employing:

(a) An on-line data base occupying 800 million bytes of disk storage.

(b) 18 terminals to input 1,705,000 accession reports per year - the 
volume of reports projected in Annex F for the year 1978/79. 

The number of terminals is computed, and is not an assumption. Assumptions 
and computation are explained in greater detail in Appendix 1. Additional 
background concerning the model system assumed is provided in Annex B.
What follows below is a summary of Appendix 1 analysis.

Reports are either keyed in their entirety or input by modifying machine-
readable records already in the data base. Alternative assumptions are
made concerning the length of records which are keyed and stored.

In "case 1" shown in the table, accession reports of 275 data characters
are input and held in the base where they occupy on average 400 characters;
the latter figure allows for control characters, directories, data base
indexes and an assumed 85% disk utilization. It is much cheaper at $0.12
to modify an existing record than to key an entire record at $1.43. Now,
the annual on-line disk storage cost for a 400 character record is only
$0.04; in theory, therefore, it would pay to keep a 400 character record
over 30 years in on-line storage if it resulted in saving the total keying
of another accession report for the same title. The argument is theoretical:
in practice, disk rental is not the only cost of keeping records in on-line
storage - for example, the data base has to be saved regularly for regeneration
purposes and it has to be reorganized. On the other hand, an average of
8 accession reports are expected for each title and this powerfully reinforces
the economic benefits of retaining records on-line: on average the keying
effort to be saved for each accession report title is 8($1.43 - 0.12) = $10.48.

"Case 2" in the table illustrates what happens to the economics when longer
records are keyed and stored: as keying costs are so much greater than 
storage costs, the case for storing records in order to save keying is even 
stronger. A record length of 1,000 data characters is chosen as illustrative 
of the position for National Library cataloguing and Canadiana: these 
applications employ records of something near this length.

"Case 3" in the table illustrates the position where 1,000 data character
records are held in order to save the keying of accession reports of 275
data characters. In theory, it is still worth storing a 1,000 data
character record for over a decade in order to save the total keying of
a further single 275 data character report for the same title.
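
The break-even reasoning behind the three cases can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the per-record costs quoted in the text and in Table 6, works in whole cents to avoid rounding drift, and, like the table, ignores software, processing and communications costs. The function names are hypothetical.

```python
from fractions import Fraction

# Per-record costs in cents, as quoted in the text and Table 6:
# (case, modify existing record, key new record, annual on-line storage)
CASES = [
    (1, 12, 143, 4),
    (2, 12, 521, 12),
    (3, 12, 143, 12),
]


def break_even_years(modify, key_new, storage_per_year):
    """Years of on-line storage paid for by avoiding one full keying."""
    return Fraction(key_new - modify, storage_per_year)


def whole_decades(years):
    """Number of complete decades contained in `years`."""
    return int(years // 10)


results = {case: break_even_years(m, k, s) for case, m, k, s in CASES}
for case, years in results.items():
    print(f"case {case}: {float(years):.2f} years, "
          f"{whole_decades(years)} decade(s)")

# Keying saved per title when 8 reports are expected: 8 x ($1.43 - $0.12).
saving_per_title_cents = 8 * (143 - 12)
print(f"saving per title: ${saving_per_title_cents / 100:.2f}")
```

The whole-decade results (3, 4 and 1) match the final column of Table 6, and the per-title saving matches the $10.48 quoted for case 1.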

Table 6: Economics of on-line input and storage in a hypothetical
CANUC EDP system (a)

      Record length             Cost of data entry ($)             Annual     Number of
      in characters                                                cost of    decades it is
      ----------------------    -------------------------------    on-line    economic to
Case  No. data    Av. length    Modify      Input    Difference    storage    hold records
      characters  in machine    existing    new                    ($)        on-line
                                on-line     record
                                record

1       275         400         0.12        1.43     1.31          0.04       3
2     1,000       1,235         0.12        5.21     5.09          0.12       4
3       275       1,235         0.12        1.43     1.31          0.12       1

Notes:

(a) The costs shown here include disk storage, terminal rental and labour
costs plus overhead. They do not include software, development,
maintenance, communication and computer processing costs. For more
detailed explanation of assumptions and computation, see Appendix 1.
Appendix 1 also shows how the figure of 18 terminals for data input
was calculated. For additional background concerning the model system
see Annex B and Chapter VI. The figures shown here differ slightly from
those in Annex B: the differences are accounted for by use
of different throughput assumptions and different keying procedures.
The figures here assume Annex F throughput statistics.

Table 6 is illustrative and theoretical, and excludes several categories
of cost. The exclusions do not affect the basic conclusions to be drawn
provided:

1. Hardware

Disk storage and computer time are charged on a cost-recovery basis.
This is a reasonable assumption in the context of the Library and
Information Retrieval Centre.

2. Software and development

Existing data base management and communications monitor software
can be employed - otherwise software and development costs and
timescales are unlikely to be acceptable. This is a reasonable
assumption in the light of the conclusions of Chapter IV.

3. Communications

Terminal and communications costs are kept within bounds by ensuring
that terminals and tied lines are well utilized. It is assumed that
the terminals would initially be in the National Library and National
Science Library, and would be connected to the Library and Information
Retrieval Centre in the Ottawa area.



With these qualifications it is possible to draw several conclusions from
Table 6:

1. On-line input

An on-line CANUC EDP system operating in a single shift mode would
require of the order of 18 terminals for straight input purposes,
assuming that all accession reports must be input via terminals.
In fact, a number of projects and institutions such as Ontario Universities'
Library Co-operative Project would wish to submit machine-readable accession
reports. This would reduce the number of terminals required for on-line
input. On the other hand, terminals would also be required for editing
and data base maintenance purposes - for instance, authority file maintenance.
The important point is that the number of terminals is not excessive, and
the system is practical.

2. On-line storage costs 

This is no longer the prime constraint on data base strategy, and storage
costs are likely to fall (perhaps considerably) over the next few years. 
At the present time the prime constraints appear to be: 

(a) the size of data base that data base management software can 
handle within acceptable response time limits. 

(b) on-line storage available with dedicated access. 

While cost is no longer the prime constraint, it is still considerable
for a large data base. The difference between the annual costs of storing 
275 data character and 1,000 data character records is $(0.12 - 0.04) = $0.08. 
For a data base of 2 million records this represents a difference in storage 
cost of $160,000 per year. There is a clear economic advantage at present 
prices in removing less used records and record data fields from the on-line 
portion of the base as the base grows. 

Against savings achievable by restricting record lengths and record numbers
in the on-line base have to be set access delays associated with relegation
of data to off-line files, and the processing costs of searching potentially
lengthy tapes to retrieve off-line data. The savings to be achieved are
lower in the early years of the data base when it is smaller, and in these
early years there is likely to be some unused disk storage capacity, since
the installation holding the base has to provide for future growth. In
addition, a fairly large initial data base is desirable in order to save
keying costs and provide support for cataloguing and acquisition processes.

In these circumstances a cost-effective strategy will be to commence on-line
data base operation with as many potentially useful records as possible - 
records containing all potentially useful data fields. Use of the data base 
should be carefully monitored so that its growth can be controlled in a 
cost-effective manner. An important aim of monitoring will be to identify 
lower use records and record data fields. Following identification, these 
classes of data can be eliminated from the on-line file and placed in off-line 
storage. This 'purging' of the on-line base file will help to hold down its 
growth and to keep it within an economic and practical size limit. If a
degree of stability in size can be achieved in the first five years of data 
base operation, there is a very good chance that falling storage costs and improved 
hardware/software facilities will allow subsequent growth without unacceptable 
economic and practical penalties. 
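
The monitor-then-purge cycle described above could be prototyped along the following lines. This is a sketch under stated assumptions, not a design from the study: the class name, the access threshold and the dictionaries standing in for the on-line and off-line files are all hypothetical.

```python
from collections import Counter


class UsageMonitor:
    """Counts accesses per record so that lower use records can be found."""

    def __init__(self):
        self.accesses = Counter()

    def record_access(self, record_id):
        self.accesses[record_id] += 1

    def low_use(self, record_ids, min_accesses):
        """Return ids accessed fewer than `min_accesses` times, sorted."""
        return sorted(r for r in record_ids if self.accesses[r] < min_accesses)


def purge(online, offline, record_ids):
    """Move the given records from the on-line file to off-line storage."""
    for record_id in record_ids:
        offline[record_id] = online.pop(record_id)


# Hypothetical data: three records on-line, a short access log.
online = {"t1": "record 1", "t2": "record 2", "t3": "record 3"}
offline = {}
monitor = UsageMonitor()
for rid in ["t1", "t1", "t3", "t1"]:
    monitor.record_access(rid)

# Assumed threshold: keep on-line only records used at least twice.
candidates = monitor.low_use(list(online), min_accesses=2)
purge(online, offline, candidates)
print(sorted(online), sorted(offline))
```

In practice the threshold and the monitoring period would be set from the observed use of the data base, which is the purpose of the monitoring the text recommends.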

3. Size of records in the data base

Following the above reasoning MARC records chosen as likely to be useful for
cataloguing or accession report input purposes may be held for several years.
All the useful data fields may be included in the on-line file: the on-line
file can serve many different applications including CANUC accession reporting,
CANUC location service, CANUC publication, Canadiana, National Library EDP
cataloguing, National Library acquisitions, and catalogue support services for
Canadian libraries. A bibliographic item might be entered into the on-line file
for one of these applications (say, National Library acquisitions) and
subsequently upgraded and used by other applications (say, National Library EDP
cataloguing and CANUC).

4. Data base strategy



Summarizing points 2 and 3 above, the National Library should plan to use an 
integrated on-line data base containing CANUC, National Library acquisitions 
and in-process files, and selected MARC records, in which the data for each 
bibliographic item is held only once. All potentially useful records and record 
data fields should be added to the initial on-line file and monitoring should 
be used to identify lower use records and data fields. Lower-use records and 
record data fields should be removed from the on-line file as soon as they have 
been identified by the monitoring process.

While the data for any title will be held only once in the on-line
base, off-line copies of this data will naturally be maintained for 
regeneration purposes, and some off-line formatted output files will 
probably also be held in order to save output costs for high volume/ 
high frequency outputs. It may take some time to integrate all systems, 
but CANUC EDP system planning should employ an integrated data base 
concept. 

The integrated data base concept is explored more fully in the next section. 



C. APPLICATION OF AN INTEGRATED ON-LINE DATA BASE CONCEPT

This concept is illustrated below in Figure 4, which is based on analysis 
contained in Annex A. The unshaded areas represent off-line files which the 
National Library may maintain. The shaded area represents the on-line file: 
the ratio of the shaded area and unshaded area is not to scale and is not 
significant. 

The large totally shaded circle is the on-line "CANUC file" including all 
union catalogue, union list, and National Library catalogue bibliographic data. 
The "MARC/ISDS files" circle overlaps with the "CANUC file" circle; overlap 
represents titles having dual MARC/ISDS and CANUC status. The overlapped titles
are all on-line and so are some of the non-overlapped MARC/ISDS titles. 

The unshaded portion of the "MARC/ISDS files" circle represents MARC/ISDS 
bibliographic items held off-line; entries for these items will be included 
in the on-line indexes - at least for the first years of integrated data base 
operations. The result of searching the on-line numeric or author/title indexes 
for one of these titles will be a message to the effect that the title exists 
in an off-line file. Retrieval from the off-line file will be possible according 
to some pre-determined schedule. For example, off-line items requested one day 
might be made available on-line the following day. 
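
The behaviour described here, an on-line index that also covers items held off-line, with scheduled retrieval, might look as follows in outline. This is a hypothetical sketch: the class, its field names and the sample keys are assumptions made for illustration and do not describe any package evaluated in the study.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    online: dict = field(default_factory=dict)        # key -> full record
    offline_keys: set = field(default_factory=set)    # indexed, held off-line
    recall_queue: list = field(default_factory=list)  # awaiting scheduled load

    def search(self, key):
        """Look up a numeric or author/title key in the on-line index."""
        if key in self.online:
            return ("on-line", self.online[key])
        if key in self.offline_keys:
            if key not in self.recall_queue:
                self.recall_queue.append(key)
            return ("off-line", "title exists in an off-line file")
        return ("not found", None)

    def load_recalled(self, offline_store):
        """Run on the pre-determined schedule, e.g. overnight."""
        for key in self.recall_queue:
            self.online[key] = offline_store[key]
            self.offline_keys.discard(key)
        self.recall_queue.clear()


# Hypothetical off-line file and catalogue contents.
offline_store = {"key-2": {"title": "Older MARC title"}}
cat = Catalogue(online={"key-1": {"title": "CANUC title"}},
                offline_keys={"key-2"})

first = cat.search("key-2")        # request made on day one
cat.load_recalled(offline_store)   # scheduled retrieval
second = cat.search("key-2")       # same request the following day
print(first[0], "->", second[0])
```

Keeping the index entry on-line while the record itself is off-line is what allows the system to tell the user that the title exists rather than reporting it as not found.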

Overlapping the two large circles is the small "Files for internal use"
circle. Most of these files are on-line, but not all. It will probably
be unnecessary to keep a number of non-bibliographic files on-line -
for instance, mailing list files. Bibliographic internal files, such
as the acquisitions in-process file and the cataloguing in-process file,
will be maintained on-line.

Overlapping all of these circles is the central broken line circle
representing the central bibliographic authority file. This is an important
refinement of the integrated data base concept, although it is possible
to use a bibliographic data base without extensive authority files. The
Ohio College Library Center data base is an example of a base which as yet
does not have a well-developed authority file system.

It is envisaged that the central authority file will be developed to
include:

1. Names, references and uniform titles

Personal, corporate, conference names, uniform titles,
and references from alternative forms or spellings of
these names and titles to the preferred forms and spellings.
Where appropriate, both English and French versions of
these would be held.

2. Subjects

Subject headings in both English and French, possibly
linked with associated classification numbers.

The main purposes of these files are to:

1. Aid consistency in bibliographic recording - particularly
cataloguing. A high level of consistency raises search
and retrieval efficiency. It is also an important criterion
of a high quality catalogue support service.

2. Facilitate data base maintenance, correction and search.
If a given author heading is held once in the authority
file rather than in many individual data records, correction
is facilitated. The links from the authority file to
individual records are also of use for search purposes.

3. Compact the data base.

Storage space is saved by holding headings once in an
authority file rather than replicating them through many
individual records.
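
The three purposes listed above all follow from storing each heading once and linking records to it. The sketch below illustrates the idea with an assumed in-memory structure; the class, method names and sample headings are hypothetical and are not taken from the study.

```python
class AuthorityFile:
    """Holds each heading once; bibliographic records refer to it by id."""

    def __init__(self):
        self.headings = {}  # heading id -> preferred form
        self.links = {}     # heading id -> set of linked record ids

    def add_heading(self, heading_id, preferred_form):
        self.headings[heading_id] = preferred_form
        self.links.setdefault(heading_id, set())

    def link(self, heading_id, record_id):
        self.links[heading_id].add(record_id)

    def correct(self, heading_id, new_form):
        # A single change; every linked record sees the corrected form.
        self.headings[heading_id] = new_form

    def records_for(self, heading_id):
        """Search use of the links: all records carrying this heading."""
        return sorted(self.links[heading_id])


auth = AuthorityFile()
auth.add_heading("n1", "Smith, Jon")  # heading entered with a misspelling

# Records store only the heading id, not the heading text.
records = {
    "r1": {"title": "First title", "author": "n1"},
    "r2": {"title": "Second title", "author": "n1"},
}
for record_id in records:
    auth.link("n1", record_id)

auth.correct("n1", "Smith, John")
displayed = [auth.headings[r["author"]] for r in records.values()]
print(displayed, auth.records_for("n1"))
```

One correction updates the heading for both records, the links serve as a search path from heading to records, and the heading text is stored once rather than twice.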



Figure 4: Diagram illustrating the National Library integrated data
base concept

D. ORDER OF MAGNITUDE ON-LINE FILE SIZE

It has to be shown that the proposed integrated on-line data base will not
become unmanageably large in size. The growth in the size of the largest
portions of the data base - CANUC and MARC/ISDS - is illustrated in Table 7
below. It is emphasized that this is only an illustration based on stated
assumptions. The basic approach accords with those recommended in Section B
of this chapter, but the mode of purging the base to stabilize its size and to
save storage costs is somewhat crude, being chosen for simplicity of exposition.
The illustrated strategy is as follows:

1. MARC/ISDS records chosen for holding on-line will be ISDS, US MARC, 
UK MARC, French MARC. Other MARC records will be off-line. 

2. MARC/ISDS titles not absorbed into CANUC within three years will be 
removed from the on-line file. To assist exposition, absorption/ 
purging is shown as occurring in the second year. In fact, the 
absorption would mostly take place at the end of the third year. 

To provide perspective, it is pointed out that the installation on which the 
national bibliographic data base of France will be stored will have 2,400 
million bytes of storage. The table shows that the on-line size will be well 
within this type of limit for the first ten years of the data base life. The 
size is also within the range that existing data base management packages can 
handle. The indicated purging technique is almost certainly not the best 
choice: monitoring of record use in the first years of operation should be
used to establish the optimum record selection technique to be employed. 

Practical experience can also assist the choice of off-line storage medium for 
given categories of record. Some categories of record may be held on magnetic 
tape and some on off-line disk packs. Use of off-line disk packs is employed 
successfully at the University of Toronto. 

Monitoring will also establish which data fields should be held on-line for
lower use items, and which data fields for the same items should be held off-line.
For instance, full bibliographic data might eventually be held on-line for records
selected as likely to be of use for cataloguing purposes - in general, newer
records. At the same time, only short bibliographic data might be held on-line
for records selected as being of use solely or mainly for location purposes -
in general, older records.

Table 7: Illustration of the growth in size of the CANUC and MARC/ISDS
portions of an on-line data base given an assumed data base
strategy

            ------------------------ Million titles ------------------------
Year        ----------- MARC/ISDS -----------  CANUC   Total net   Data     Million bytes
commencing  Retrospective  Input  Absorption/  input   annual      base     disk storage
1st April   input (a)      (b)    purging (c)  (d)     input       size     (e)
            A              B      C            D       A+B-C+D

1977        1.0            0.2    -            0.2     1.4         1.4        900
1978        -              0.3    0.4          0.2     0.1         1.5        965
1979        -              0.3    0.5          0.2     -           1.5        965
1980        -              0.3    0.5          0.2     -           1.5        965
1981        -              0.4    0.5          0.2     0.1         1.6      1,029
1982        -              0.4    0.6          0.2     -           1.6      1,029
1983        -              0.4    0.4          0.3     0.3         1.9      1,221
1984        -              0.4    0.4          0.3     0.3         2.2      1,415
1985        -              0.4    0.4          0.3     0.3         2.5      1,608
1986        -              0.4    0.4          0.3     0.3         2.8      1,800


Notes:

(a) Annex p. 4 shows 1.34 million records; page 9 of the same Annex shows that the
number of titles is approximately 77% of this number. 77% of 1.34 = 1.0

(b) Annex page 9.

(c) It is assumed that the initial retrospective file will be retained longer than
subsequent MARC/ISDS input, since it is not realistic to assume that absorption
could run markedly in advance of CANUC input. It would in any case take some
time to deal with the 1 million records and time will also be needed to monitor
record use. Absorption for the backfile is assumed to take place over the years
1978-82, with final purging during 1982.

(d) Annex F, Appendix 2. In accord with the figures in this appendix, an annual
increase of 6000 in the number of titles reported is assumed.

(e) It is assumed that the data base is composed of: 5% non-MARC titles occupying
400 bytes storage each and 95% MARC titles occupying 656 bytes each.
This implies an average storage length per record of 643 bytes. The 656 bytes
is derived from the average length of US MARC records in distribution format of
636 characters, compacted to the extent (78%) achieved by Ohio College Library
Center, plus the 74 index characters assumed on p. 4-2 of Annex B. 85% disk
utilization is assumed, as in Annex B.
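
The arithmetic of Table 7 can be traced with a short script. The sketch below takes the annual flows as read from the table (in tenths of a million titles, so that everything stays in integers), applies the A+B-C+D rule, and converts the running size to disk storage using the average record length from the last note. The variable and function names are illustrative, and the retrospective input of 1.0 million in 1977 is taken from the first note.

```python
# Annual flows in tenths of a million titles, read from Table 7:
# (year, A retrospective input, B MARC/ISDS input,
#  C absorption/purging, D CANUC input)
FLOWS = [
    (1977, 10, 2, 0, 2),
    (1978, 0, 3, 4, 2),
    (1979, 0, 3, 5, 2),
    (1980, 0, 3, 5, 2),
    (1981, 0, 4, 5, 2),
    (1982, 0, 4, 6, 2),
    (1983, 0, 4, 4, 3),
    (1984, 0, 4, 4, 3),
    (1985, 0, 4, 4, 3),
    (1986, 0, 4, 4, 3),
]

# 5% non-MARC titles at 400 bytes and 95% MARC titles at 656 bytes.
AVG_BYTES_X100 = 5 * 400 + 95 * 656        # 64,320, i.e. 643.2 bytes
AVG_BYTES = (AVG_BYTES_X100 + 50) // 100   # rounded to 643 bytes

# Disk storage column as printed in Table 7 (million bytes).
TABLE_BYTES = [900, 965, 965, 965, 1029, 1029, 1221, 1415, 1608, 1800]


def project(flows):
    """Return (year, net input, data base size, million bytes) per year."""
    size = 0
    rows = []
    for year, a, b, c, d in flows:
        net = a + b - c + d
        size += net
        # size is in units of 100,000 titles; round half up to million bytes
        million_bytes = (size * AVG_BYTES + 5) // 10
        rows.append((year, net, size, million_bytes))
    return rows


rows = project(FLOWS)
for (year, net, size, mbytes), printed in zip(rows, TABLE_BYTES):
    print(f"{year}  net {net / 10:.1f}  size {size / 10:.1f}  "
          f"{mbytes:>5,} million bytes (table: {printed:,})")
```

The computed sizes reproduce the data base size column, and the storage figures agree with the printed column to within one million bytes (the 1983 entry computes to 1,222 against a printed 1,221, presumably a rounding difference in the source).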



E. APPROACHES ADOPTED BY SIMILAR PROJECTS ELSEWHERE 



The data base strategy outlined above is very similar to
the strategies adopted or planned for similar data bases elsewhere.
Annex H describes national bibliographic EDP development plans for
France and Sweden.

The Bureau pour l'automatisation des bibliothèques, France, plans to
add all records of national MARC services to its projected on-line
CAPAR (Catalogue partagé) file and to make these available on-line to a
number of libraries from 1st January 1975. In 1977 the Bureau will have
a new computer installation on the Isle d'Abeau with some 2,400 million
bytes of disk storage capacity and 512K core storage.

In the Swedish LIBRIS project 

"All of the university libraries are being connected to a central 
computer which amongst other things contain a data bank with full 
bibliographic information on the total stock (books, periodicals, 
etc.) in the country's academic libraries, together with information 
on acquisitions, suppliers and references".

In the US, the Ohio College Library Center maintains an on-line file of
some 0.8 million titles, nearly half of which are full US MARC records
held in a format rather more compact than the US MARC distribution format.
The remaining records are those created by OCLC member libraries with as 
much bibliographic detail as they require for their use. 

Apart from broadly common data base approaches these projects are similar 
in another respect: they are all major long-term enterprises involving 
considerable expenditures. As noted in Annex H, in 1974 the Bureau pour
l'automatisation des bibliothèques will spend its annual operating budget
of 3 million Francs (approximately $0.6 million) plus an equipment allocation of
4.5 million Francs (approximately $0.9 million): a total expenditure in one year
of some $1.5 million. The long-term nature of this project is indicated
by the nature of its planning which spans at least two five-year plans.

Ohio College Library Center had a total expenditure of some $860,000 in
the year ended 30 June 1973 [25]. Although the Center's on-line cataloguing
service commenced operation in 1971 it will be several years before the
Center's facilities and services cover those already projected. Full
development of the Canadian national bibliographic data base and the
services supported by this base will be a major enterprise involving
considerable expenditures over the greater part of a decade. Special
Canadian requirements, such as those associated with bilingualism and
network development in a country of Canada's geographical size, will add
to the already complex problems related to the development of a large
national bibliographic data base and the services to be provided from this
base. The gains of undertaking this type of development may be inferred from
the way in which the OCLC system is being replicated in different regions of
the USA on a self-supporting basis and from the national plans of France and
Sweden, as well as the more direct National Library, CANUC, and other benefits
which may be realized in Canada.

VI. FURTHER ASPECTS



A. NETWORK DEVELOPMENT

Chapters IV and V propose the basic central strategy which should guide
the planning and implementation of the National Library's bibliographic
data base and the systems which this base is to support over the next
five years. However, it is important to emphasize that this strategy
will need to be developed within a changing national and international
bibliographic EDP network context. Such networks are already in existence
or taking shape - to mention only four developments:

1. The National Library is already a full participant in the International 
Serials Data System and the international MARC exchange network. 

2. Within 1974 the National Science Library's Canadian On-Line Enquiry
(CAN/OLE) pilot project will link some 15 centres across Canada by
terminal to the NSL's CAN/OLE data base. The National Library will be
one of the participants in this project [26].

3. Within 1974 the National Library will be linked by terminal to the
Ontario Universities' Library Co-operative Systems monograph demonstration
project, which will have some eight other participating libraries in
Ontario and Quebec.

4. Within 1974 the National Library in co-operation with the National 
Science Library plans to be linked by terminal to the Ohio College Library 
Center to participate in the US/Canadian co-operative project for the 
conversion of serials (CONSER).

In this context, conclusions drawn following an examination of French
and Swedish network development plans are relevant. The following
conclusions are quoted from Annex H:

"1. France and Sweden will achieve their national bibliographic EDP
networks much faster and more cheaply than would have been the
case if planning had not been attempted.

2. The resulting networks will operate very much more cost-effectively
than networks based on uncoordinated development.

3. The case of Sweden shows that planning can be done effectively
even though control is not totally centralized. The Swedish mode
of planning and network development may have some relevance to
countries like Canada, Britain and the United States where control
of libraries and information services is relatively decentralized
compared with France."

The need for network planning is underlined by the following conclusions
drawn in Annex C:

"On-line cataloguing services would be provided more economically
by a single national centre than by multiple regional centres,
assuming that regional centres replicate national centre hardware
and basic facilities ... however, there are other factors besides
economics and other EDP services required by Canadian libraries
in addition to on-line cataloguing services. A single national
centre could not provide all the EDP services required by all libraries
in Canada - for instance, such a centre could not have sufficiently
fast implementation to provide total selection, acquisition, cataloguing,
circulation and IR facilities within the timescales required by Canadian
libraries. Also it is not clear that a single national file could or
should record the current loan or in-process status of every copy of every
bibliographic item in Canada. As far as comprehensive library automation
facilities are concerned, a measure of decentralization is both inevitable
and a present fact. Beyond catalogue support, therefore, the
conclusions of this study merely serve to point to the need for network
planning in order to achieve the benefits of co-ordinated development
and to minimize duplication of hardware and other facilities. In general
it is expensive to replicate items such as mainframe hardware and the
software development staff working on the software to be used by the
mainframe hardware."

In the light of these conclusions it is recommended that the National Library, 
in co-operation with other libraries across Canada, should work out a plan for 
Canadian national bibliographic EDP network development. This plan should be 
phased and should show for each phase the extent of the network, including: 

- details of participating organizations 

- location and nature of hardware employed 

- location and nature of files accessible to the network 

- communications facilities 

- services and facilities offered by each participating organization. 

This recommendation accords closely with the views of the Canadian Computer/ 
Communications Task Force, exemplified in the following recommendations of the 
report of the Task Force entitled Branching Out: 

1. "Computer/communications (i.e. computer services by remote-access 
   through communications facilities) should be recognized by governments 
   as a key area of industrial and social activity, and steps should be 
   taken towards strengthening ... co-ordination of its development to 
   the benefit of Canadian society." 

2. "In the formulation of national computer/communications policy a 
   unified approach throughout Canada should be stressed as a key factor 
   requiring close co-ordination between federal and provincial actions." 

3. "In the area of federal responsibilities a Focal Point should be 
   established within the government for co-ordination of the development, 
   formulation and continuing evaluation of national policy in all matters 
   pertaining to the field of computer/communications." 

Immediate tasks in the area of bibliographic EDP network planning 
include: 

- The strengthening of organizational arrangements and the provision 
  of additional resources to support joint national/provincial planning 
  concerning library and information service development. 

- Canadian National Union Catalogue system planning, for instance the 
  manner in which libraries will report when the new CANUC EDP system 
  is operational. While many libraries may wish to continue reporting 
  exactly as they do now, some will wish to report in machine-readable 
  form. The mode and timescale of this machine-readable reporting need 
  to be worked out in detail. 

- Planning of catalogue support services, both off-line and on-line. 
  Joint national/provincial planning is required since some services 
  will be provided by inter-provincial and intra-provincial projects 
  (for example, Ontario Universities' Library Co-operative System) and 
  some may be provided by foreign or international projects (for example, 
  Ohio College Library Center). 

- Active participation in international bibliographic EDP network planning 
  and development, for instance MARC, ISDS and UNISIST network planning 
  and development. 



B. NATIONAL LIBRARY HARDWARE FACILITY REQUIREMENTS 

Data base and system strategy recommendations in Chapters IV and V have 
implications relating to National Library hardware facility requirements. 
Use of the type of software indicated in Chapter IV implies the following 
order-of-magnitude core storage requirements: 

                                                               Kilobytes 

User application programs: allow                                   250 

Data base management software (e.g. ADABAS, 110K)            ) 
Communications monitor software (e.g. Intercomm, 40K)        )     500 
Operating system (e.g. OS/VS or OS/MVT, allowing 350K)       ) 
                                                                   --- 
                                                                   750 



Employing the type of data base strategy indicated in Chapter V, disk storage 
requirements would be of the order of 1,200 million bytes for the first few 
years of the integrated system. This allows 200 MB for direct access storage 
of data base management package and operating system software, and assumes that: 

(a) The total number of titles in the on-line data base approximates to the 
    total number of projected CANUC, ISDS and French, UK and US MARC titles. 

(b) An average of 1,000 data characters per title is sufficient for support 
    of applications implemented in the first few years of integrated system 
    operation. 

While these assumptions are acceptable for order-of-magnitude projection 
purposes they will need to be checked when hardware requirements are studied 
and specified in detail. 
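
As a rough cross-check of these figures, the sketch below works out how many titles the 1,000 MB left for data would hold at 1,000 characters per title. It assumes that one character occupies one byte and ignores index and utilization overheads, which Appendix 1 treats separately. The function name and the resulting title count are illustrative and do not appear in the study.

```python
# Order-of-magnitude disk sizing, using the figures quoted in the text.
TOTAL_DISK_MB = 1_200      # projected disk storage (million bytes)
SOFTWARE_MB = 200          # reserved for DBMS and operating system software
CHARS_PER_TITLE = 1_000    # assumed average data characters per title


def implied_title_capacity(total_mb, software_mb, chars_per_title):
    """Titles that fit in the space left for data.

    Assumption: 1 character = 1 byte, with no index or utilization overhead.
    """
    data_bytes = (total_mb - software_mb) * 1_000_000
    return data_bytes // chars_per_title


titles = implied_title_capacity(TOTAL_DISK_MB, SOFTWARE_MB, CHARS_PER_TITLE)
print(f"Data space: {TOTAL_DISK_MB - SOFTWARE_MB} MB, about {titles:,} titles")
```

On these assumptions the data space corresponds to about one million titles.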

Based on these assumptions it is possible to draw out an illustrative computer 
hardware configuration with the basic facilities needed for adequate support of 
a National Library on-line integrated bibliographic information system. An 
illustrative configuration is shown in Table 8 and Figure 5 below; terminals 
are not shown since the number required depends very much on detailed system 
design. The configuration employs IBM hardware because this is used by the 
majority of the organizations which will participate in the Library and Information 
Retrieval Centre. The Centre would need a larger configuration: Table 8 and 
Figure 5 indicate the general scope of the facilities the National Library might 
expect to use within this larger configuration. It is emphasized that Table 8 
and Figure 5 should be interpreted in the context of the Library and Information 
Retrieval Centre: a stand-alone National Library configuration has not been 
considered - this would require an entire economic feasibility study in itself. 

The costings are also illustrative: they are IBM basic monthly rental prices 
at February 1974 levels. Tendering and contractual terms could make a substantial 
difference to the final cost of an installation of the type illustrated. 

Table 8 : Computer configuration illustrating the basic facilities 
          required for adequate support of a National Library on-line 
          integrated bibliographic information system 

Quantity  Description                                           $/month 

          IBM 370/145 central processor with 768 KB,             23,000 
          console and necessary accessories 

          3333-11 disk unit: controller, plus 400 MB storage      2,530 

          3330-11 double density disk drives with                 4,240 
          400 MB each 

   1      3803 tape drive controller                           ) 
   4      3420 tape drives, 800/1600 bpi, 320 KB               )  4,000 
          transfer rate                                        ) 

          3705 transmission control unit                          2,000 

          3211 printers with ALA print chains and all             3,300 
          necessary accessories 

          Card reader/punch                                       1,500 
                                                                 ------ 
          Total                                                  40,570 
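
The total in Table 8 can be checked by summing the listed monthly rentals. The sketch below does this in Python. The annualized figure and the disk share are simple derived values added for illustration, not figures quoted in the study.

```python
# Monthly rental figures ($) as listed in Table 8 (February 1974 levels).
TABLE_8_RENTALS = {
    "370/145 CPU, 768 KB, console, accessories": 23_000,
    "3333-11 disk unit (controller + 400 MB)": 2_530,
    "3330-11 double density disk drives": 4_240,
    "3803 tape controller + 3420 tape drives": 4_000,
    "3705 transmission control unit": 2_000,
    "3211 printers with ALA print chains": 3_300,
    "card reader/punch": 1_500,
}

monthly_total = sum(TABLE_8_RENTALS.values())
annual_total = 12 * monthly_total  # plain annualization, an illustrative assumption

# Share of the monthly rental taken by disk storage (3333-11 plus 3330-11).
disk_share = (2_530 + 4_240) / monthly_total

print(f"Monthly total: ${monthly_total:,}")
print(f"Annual total:  ${annual_total:,}")
print(f"Disk share:    {disk_share:.1%}")
```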






[Figure 5: block diagram. A 3145 CPU (768 KB) with 3210 console is connected to: 
a 3333-11 disc unit and 3330-11 disc drives (400 MB each); a 3803 controller with 
four 3420 tape drives; a 3705 transmission controller serving local and remote 
terminals; 3211 printers with 3216 ALA print chain, through a 3811 controller; 
and a 3505 reader and 3525 punch, with a built-in controller for card reader 
and punch.] 

Figure 5 : Computer configuration illustrating the basic facilities 
           required for adequate support of a National Library on-line 
           integrated bibliographic information system 

C. USE AND TREATMENT OF MARC AND ISDS RECORDS 

In drawing out a basic National Library bibliographic data base strategy 
in Chapter V, it was assumed that the National Library would acquire ISDS, 
US MARC, UK MARC and French MARC records and would add these (initially) 
to the on-line base. MARC tapes of other countries might be acquired 
and redistributed, but it was assumed that it was unlikely that there 
would be economic justification for adding these to the on-line base. 
The latter assumption was made because languages other than English and 
French account for a low proportion of use, and a low proportion of 
material in the base. English and French language material accounts for 
some 88% of location requests; German language material accounts for just over 
6% and all other languages account for the remaining 6%. Further, as 
noted in Annex D, the participants in the Canadian MARC pilot project had 
no definite requirements for MARC tapes other than those of the US, UK 
and France. Australian material of prime interest to Canada is assumed 
to be included in the US MARC tapes. 

Addition of (selected) ISDS, US MARC, UK MARC and French MARC records to the 
on-line base would be useful for several purposes: 

1. CANUC accession reports 

   As indicated in Chapter V these reports require a high-volume input 
   operation. Maintenance of bibliographic data on-line will significantly 
   reduce the otherwise heavy costs of keying reports. ISDS record data 
   can also be used for on-line ISDS/Canada master file purposes. 

2. Canadiana/National Library cataloguing 

   The records will have an on-line catalogue support function for these 
   applications, speeding cataloguing and reducing keying costs. 






3. MARC network services 

   Maintenance of the records on-line will enable the National Library to 
   provide fast response in record selection and provision on behalf of 
   the Canadian MARC network - reducing the need for Canadian libraries 
   to hold large backfiles of MARC/ISDS records. Equally important from 
   the point of view of the network, upgraded records can be supplied. 
   For example, where the National Library has acquired and catalogued a 
   foreign title, a request for the foreign MARC record for that title can 
   be met with the record as upgraded and used by the National Library. 

These advantages will accrue without an on-line catalogue support 
network; once such a network service is established, maintenance of 
records on-line will be mandatory. 

A further question relates to the format in which records are held. Annex G 
indicates that US/Canadian, UK/Canadian and Intermarc/Canadian monographic 
format machine translations should be acceptable for most practical purposes 
and that translation costs are unlikely to be excessive. It will, however, 
take several man-months to specify and to bring each format translation program 
to operational status. It is assumed that ISDS and foreign MARC records will 
be translated to the basic National Library data base format as soon as they 
are received. Where these records are redistributed, the preferred format 
will be the Canadian MARC format, as indicated in Annex D. 

D. PUBLICATION OF CANUC SUPPLEMENTS 

The CANUC EDP system will have the facility of producing supplements to the 
main union catalogue, where the latter has been published. Publication of 
supplements will affect the number of location requests directed to the 
National Library which might be serviced via the EDP system. As shown in 
Table 5, the number of such requests is too low to affect basic data base 
design and strategy. However, it is still of interest to consider the 
likely cost, mode and pattern of supplement publication. Annex B analyses 
the costs of publishing supplements in a range of forms: 48x reduction 
microfiche, 150x reduction ultrafiche, and hard copy. 

Costing has to assume definite sequences and publication patterns. Discussions 
were held with National Library staff to determine the assumptions to be employed, 
and these discussions led to the selection of a two-section catalogue. The 
two sections are: 

1. Bibliographic data section 

   Bibliographic entries in the order of addition to the catalogue, 
   referenced by a running machine-generated number, shown in sample 
   layout in Appendix 3. 

2. Index section 

   ISBN, LC/Canadiana number and author/title indexes to the master 
   catalogue section. These are also shown in sample layout in 
   Appendix 3. 

This arrangement was chosen for several reasons: 

- Location searching is speeded since, for any given search, it will in 
  most cases only be necessary to consult one of the indexes. 

- Full bibliographic data is provided for resolution of queries relating 
  to requests. This data may also serve a catalogue support function. 

- Costs of providing relatively complete bibliographic data are minimized 
  since it is not necessary to cumulate and re-issue the bibliographic data 
  section. It is assumed that additions to this section would be issued at 
  the same intervals as the index section. Alternative index issue frequencies 
  assumed for costing purposes are noted below. 

Four cumulation patterns were costed for the indexes: 

(a) Issues every 2 months cumulating continuously to a 
    2 year volume, then starting afresh. 

(b) Issues every 2 months cumulating continuously to a 
    5 year volume, then starting afresh. 

(c) Issues every 6 months cumulating continuously to a 
    2 year volume, then starting afresh. 

(d) Issues every 6 months cumulating continuously to a 
    5 year volume, then starting afresh. 

Total subscription costs were calculated for the total catalogue including 
bibliographic data and index sections. These costs include: 

- Computer costs of extracting records from the base, sorting and formatting 
  for printing 

- Printing in 300 copies 

- Distribution and administration - applied as a percentage to direct costs. 

Table 9 shows the total annual subscription to be paid by each of 300 subscribers 
assuming that catalogue supplements are sold at cost. 



Table 9 : Theoretical subscription costs for catalogue supplements under 
          alternative assumptions 

                                     Average yearly per-subscriber cost ($) 
Index issue rate                     Microfiche    Ultrafiche    Hard copy 

(a) Every 2 months for 2 years           727           879          3400 
(b) Every 2 months for 5 years          1616          1606          8949 
(c) Every 6 months for 2 years           333           405          1488 
(d) Every 6 months for 5 years           629           752          2924 

Not visible from the costs is a potential problem: if the 
issue-every-2-months, volume-every-5-years index-publishing option were 
chosen, the final issue of the indexes (not including the bibliographic 
section of the supplements) towards the end of the fourth year would 
contain 125 thousand pages and would have resulted in a total of two 
million pages having been progressively superseded over the previous 58 
months. Hard copy will, therefore, (in all cases, not just this worst 
possible one) pose serious doubts concerning the practicality of handling 
and distribution, and is not recommended. 
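
The order of magnitude of these page counts can be reproduced with a simple model. The sketch below assumes, purely for illustration, that every two-monthly issue adds the same number of new index pages, so that the k-th issue of a volume holds k increments. Only the 125 thousand page final issue is taken from the text.

```python
# Hard-copy index bulk for an issue every 2 months cumulating to a 5 year volume.
# Assumption (not from the source): each issue adds the same number of new pages.
ISSUE_INTERVAL_MONTHS = 2
VOLUME_MONTHS = 60
FINAL_ISSUE_PAGES = 125_000   # size of the final cumulated issue, from the text

issues = VOLUME_MONTHS // ISSUE_INTERVAL_MONTHS      # issues per volume
pages_per_increment = FINAL_ISSUE_PAGES / issues

# Every issue except the last is superseded by its successor.
superseded_pages = sum(k * pages_per_increment for k in range(1, issues))
superseded_span_months = (issues - 1) * ISSUE_INTERVAL_MONTHS

print(f"{issues} issues, superseded issues span {superseded_span_months} months")
print(f"Superseded pages per subscriber copy: {superseded_pages:,.0f}")
```

Under this assumption the 29 superseded issues span 58 months and total roughly 1.8 million pages, which agrees in order of magnitude with the two million pages cited.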

For the other media, reading equipment would in most cases have to be 
obtained by subscribers since most libraries do not have 48x or 150x 
reduction fiche readers. However, the rental cost of readers (under $100 
per year for microfiche, and under $300 per year for ultrafiche) is not 
high enough to affect significantly the choice between hard copy and 
microform. 
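
This can be checked against Table 9 by adding the highest quoted reader rental to each microform subscription and comparing the result with hard copy. The sketch assumes one rented reader per subscriber, which the study does not state.

```python
# Per-subscriber yearly costs ($) from Table 9, by index cumulation pattern.
TABLE_9 = {
    "(a) every 2 months, 2 years": {"microfiche": 727, "ultrafiche": 879, "hard copy": 3400},
    "(b) every 2 months, 5 years": {"microfiche": 1616, "ultrafiche": 1606, "hard copy": 8949},
    "(c) every 6 months, 2 years": {"microfiche": 333, "ultrafiche": 405, "hard copy": 1488},
    "(d) every 6 months, 5 years": {"microfiche": 629, "ultrafiche": 752, "hard copy": 2924},
}

# Upper bounds on yearly reader rental quoted in the text.
READER_RENTAL_MAX = {"microfiche": 100, "ultrafiche": 300}


def microform_savings(costs):
    """Saving against hard copy for each microform, after adding the maximum
    reader rental (one reader per subscriber is an assumption)."""
    return {
        medium: costs["hard copy"] - (costs[medium] + rental)
        for medium, rental in READER_RENTAL_MAX.items()
    }


savings = {pattern: microform_savings(costs) for pattern, costs in TABLE_9.items()}
for pattern, by_medium in savings.items():
    print(pattern, by_medium)
```

In every pattern the microform cost, reader included, stays well below the hard copy cost.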

Ultrafiche has a distinct handling advantage, since it involves far fewer 
fiche and is coated with a protective laminate that gives it an exceptionally 
long life. This could be a factor in deciding the media for the main union 
catalogue, the bibliographic data section of the supplements, and the final 
cumulation of indexes which will be in constant use for a long period. A 
counter-argument concerning durability is that since microfiche copies cost 
only 20c each it would be inexpensive to replace damaged or lost microfiche. 

The choice between alternative index issue rates involves weighing the 
advantages of more frequent issues against the advantages of less 
frequent issues: 

(a) Advantages of more frequent issues 

    More frequent issues keep the published union catalogue more up-to-date 
    and allow a higher proportion of location requests to be satisfied via 
    search of the published catalogue. Spreading of interlibrary loan 
    requests can be encouraged by frequent issues, each with a new permutation 
    of the locations associated with each index entry. 

(b) Advantages of less frequent issues 

    Publication costs and subscriptions are lower with less frequent 
    issues, and cumulations can be more comprehensive within a given 
    subscription cost. A less frequent issue rate allows a longer 
    average time for locations to be reported for newly reported titles, 
    facilitating the spreading of the interlibrary loan requests for 
    these titles. 

On balance it is suggested that six-monthly issues cumulating to five-year 
volumes will be a more cost-effective choice than patterns involving issues 
every two months and/or cumulation to two-year volumes. The principal 
reason for this suggestion is that a relatively small proportion of requests 
relate to the newest titles: only about 12% of requests relate to material 
less than 2 years old (note 20). This makes it less important to have very 
frequent issues, and more important to have longer periods of cumulation. 

Reference should be made to Annex B for detailed cost and volume computations 
and assumptions. The Annex also describes alternative Computer Output 
Microfilm (COM) processes and equipment and recommends readers for reading 
48x reduction microfiche and 150x reduction ultrafiche. The recommendations 
apply only at the time of writing: in the course of time new readers are 
likely to become available and there are likely to be changes in the price, 
delivery, and service positions of existing models. The recommended readers 
are: 

48x reduction microfiche 

    Bell and Howell, Canada    COM-200    $250 (vertical image on screen) 
    3M Company, Canada         Minicat    $239 (horizontal image) 

150x reduction ultrafiche 

    National Cash Register     455-2      $725 
                               455-3      $585 (portable reader) 

Prices are purchase prices as at February 1974: they are subject to 
reduction for volume purchase. 

Annex B also outlines a possible telephone Voice Answerback enquiry 
service. This is included as a possibility which might be considered 
in the longer term. 

APPENDIX 1 

ECONOMICS OF ON-LINE INPUT AND STORAGE 
IN A HYPOTHETICAL CANUC EDP SYSTEM 

The purpose of this appendix is to show the assumptions and computation used 
to give the figures presented in Table 6 : Economics of on-line input and 
storage in a hypothetical CANUC EDP system. The hypothetical CANUC EDP system 
is that detailed in Annex B with accession report volume amended to accord 
with Annex F projections. 

1. Disk storage occupied by records and indexes 

As detailed in Annex B, page 4-2, each record is assumed to require indexes 
of 74 characters in length. This figure is not assumed to increase for 
longer records since all essential access points are covered in short records 
and the longer records simply have more data - not more index access points. 

In addition, disk utilization is assumed to be only 85%, i.e. records carry 
a 15% data storage overhead. A further overhead is assumed in the next 
paragraph. This assumption is pessimistic: Ohio College Library Center 
compact US MARC records to 78% of their communication format length without 
loss of data. They are able to reconstitute the records in full if necessary. 

These assumptions result in the following average record storage lengths 
for data records of 275 and 1,000 characters. 

data record    computation                                       average record 
characters                                                       storage length 

    275        (275 data characters + 74 index characters)        400 characters 
  1,000        (1,000 data characters + 74 index characters)     1235 characters 
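
The tabulated storage lengths can be reproduced by adding the 74 index characters to the data characters and applying the 15% overhead as a multiplier of 1.15. That reading of the overhead is an assumption, adopted because it matches the table; dividing by 85% instead would give about 411 and 1,264 characters.

```python
INDEX_CHARS = 74          # index characters per record (Annex B figure quoted above)
STORAGE_OVERHEAD = 0.15   # 15% data storage overhead


def stored_length(data_chars, index_chars=INDEX_CHARS, overhead=STORAGE_OVERHEAD):
    """Average stored length of a record: data plus index, inflated by the overhead.

    Applying the overhead as (1 + overhead) is an assumption chosen because it
    matches the tabulated figures.
    """
    return (data_chars + index_chars) * (1 + overhead)


for data_chars in (275, 1_000):
    print(data_chars, round(stored_length(data_chars)))
```

The results are about 401 and 1,235 characters; the table rounds the first to 400.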






2. Cost of disk storage 

A data base size of 800 million bytes (MB) is assumed, occupied by data 
(600 MB) and software (200 MB). The software storage overhead is included 
in total data base storage costs, i.e. the cost for 800 MB storage is charged 
against the data which occupies only 600 MB. The software storage overhead 
increases data storage unit costs more in a small base than in a large one. 
The assumption of a base which is relatively small compared with the 2400 MB 
disk storage projected for the installation to hold France's national 
bibliographic data base is made because conservative rather than optimistic 
costings are preferred. 

    1 x 3333-11 (Controller with 400 MB)     $2,530/month 
    1 x 3330-11 (Dual density; 400 MB)       $2,120/month 
                                             ------------ 
                                             $4,650/month 

If 600 MB of data is stored for $4,650/month = $55,800/year, then: 

     100 bytes of data are stored for  $0.93 x 10^-2 per year 
     400 bytes of data are stored for  $3.72 x 10^-2 per year 
    1235 bytes of data are stored for $11.48 x 10^-2 per year 
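
The per-record figures follow from dividing the yearly rental by the 600 MB of data. A minimal sketch of that arithmetic, with illustrative function and variable names:

```python
MONTHLY_RENTAL = 2_530 + 2_120   # 3333-11 plus one 3330-11, $/month
DATA_BYTES = 600_000_000         # data held; the 200 MB of software is charged to it

annual_rental = 12 * MONTHLY_RENTAL
cost_per_byte_year = annual_rental / DATA_BYTES


def yearly_storage_cost(record_bytes):
    """Yearly disk rental attributable to one stored record of the given length."""
    return record_bytes * cost_per_byte_year


for record_bytes in (100, 400, 1_235):
    print(f"{record_bytes:>5} bytes: ${yearly_storage_cost(record_bytes):.4f} per year")
```

For a 1,235 byte record this arithmetic gives about $0.1149 per year, marginally above the $11.48 x 10^-2 shown, which appears to be a truncated figure.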

3. Terminal operation 

A terminal operator costs $15,000 per year including overhead. His terminal 
costs $2,400 per year. He works 220 days per year, keying 4,000 keystrokes 
per hour. He therefore achieves 4.4 million key depressions per year. Assuming 
single-shift operation, the 4.4 million key depressions cost $15,000 + $2,400 = 
$17,400 per year. A lower cost per million key depressions would almost certainly 
be achieved by use of more than one shift, if this could be arranged. Use of 
more than one shift is not assumed, in case union or other difficulties prevented 
it. 
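
The cost of a million key depressions is not stated above but follows directly from these figures. The sketch below derives it, together with the keying hours per day implied by 4.4 million key depressions a year. Both derived values are illustrative and are not quoted in the study.

```python
OPERATOR_COST = 15_000            # $/year, including overhead
TERMINAL_COST = 2_400             # $/year
DAYS_PER_YEAR = 220
KEYSTROKES_PER_HOUR = 4_000
KEYSTROKES_PER_YEAR = 4_400_000   # figure quoted in the text

# Keying hours per day implied by the quoted figures (not stated in the source).
implied_hours_per_day = KEYSTROKES_PER_YEAR / (DAYS_PER_YEAR * KEYSTROKES_PER_HOUR)

station_cost = OPERATOR_COST + TERMINAL_COST   # one operator plus one terminal
cost_per_million = station_cost / (KEYSTROKES_PER_YEAR / 1_000_000)

print(f"Implied keying hours per day: {implied_hours_per_day:.1f}")
print(f"Cost per million key depressions: ${cost_per_million:,.0f}")
```

The quoted figures imply 5 keying hours per day and a little under $4,000 per million key depressions.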

4. Assumptions for terminal computation 

1,705,000 accession reports are projected for the year 1978/79. Since it 
is projected that the average title will be reported eight times, 7 out of 
8 (88%) reports will find matches. Assuming that a selection of MARC records 
with zero locations is also held, an even higher proportion will find 
matches. This proportion in percentage terms is assumed to be 95%. 

350 key depressions are required to input full details of records not on 
file: 275 characters plus 65 characters for shifts, field identifiers, 
etc. and a further 10 characters for additional data added because it 
occurs on the accession report and appears useful. 

Records already on file are assumed to require 20 key depressions to find 
the existing machine readable report, 4 to add a new location and a further 
6 for additional data or editing, i.e. a total of 30 key depressions. 
Additional data is assumed to be added where early accession reports contain 
less data than subsequent accession reports. 

5. Terminals required for hypothetical CANUC EDP system 

Assuming that all accession reports are entered in a single shift operation 
by terminals, the above assumptions lead to the following computation: 

(a) Input of records with bibliographic data already on file: 

     95     1,705,000 accession reports x 30 key depressions 
    --- x  --------------------------------------------------  =  11.04 terminals 
    100      4.4 million key depressions per terminal/year 

(b) Input of records without bibliographic data already on file: 

      5     1,705,000 accession reports x 350 key depressions 
    --- x  ---------------------------------------------------  =   6.78 terminals 
    100      4.4 million key depressions per terminal/year 

                                               Rounded Total      18.0 terminals 

6. Unit cost of inputting an accession report by 
   modifying an existing machine readable record 

Following the above computation, in 1978/79 there will be 95% of 
1,705,000 = 1,620,000 additional locations to existing titles which 
will occupy 11 terminals and operators @ $17,400. 

                                       $17,400 x 11 
Unit cost per additional location  =  --------------  =  $0.118 
                                        1,620,000 

7. Unit cost of keying total accession report 

In 1978/79 there will be 5% of 1,705,000 = 85,000 accession reports 
being input from scratch and occupying 7 terminals. 

Unit cost per totally keyed  )      $17,400 x 7 
accession report             )  =  -------------  =  $1.43 
                                      85,000 

The above unit cost assumes records of 275 basic data characters. 
A record of 1000 basic data characters will have proportionately higher 
unit costs. 

Unit cost of input of  )               1000 
1000 character input   )  =  $1.43 x  ------  =  $5.21 
                                       275 

8. Unit cost of all accession report input 

1,705,000 accession reports occupy 18 terminals and operators. 

                       $17,400 x 18 
Overall unit cost  =  --------------  =  $0.184 
                        1,705,000 
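
The terminal and unit cost computations above can be reproduced as follows. The sketch follows the appendix in using whole terminals (11, 7 and 18) and rounded report counts for the unit costs; variable and function names are illustrative.

```python
REPORTS_PER_YEAR = 1_705_000         # accession reports projected for 1978/79
MATCH_RATE = 0.95                    # share of reports matching a record on file
KEYS_MATCHED = 20 + 4 + 6            # find record, add location, edit
KEYS_NEW = 275 + 65 + 10             # data, shifts and field identifiers, extra data
KEYS_PER_TERMINAL_YEAR = 4_400_000   # single-shift operator capacity
STATION_COST = 15_000 + 2_400        # operator plus terminal, $/year


def terminals_needed(reports, keys_per_report):
    """Terminal-years of keying needed for a given report volume."""
    return reports * keys_per_report / KEYS_PER_TERMINAL_YEAR


matched_reports = REPORTS_PER_YEAR * MATCH_RATE
new_reports = REPORTS_PER_YEAR - matched_reports
terminals_matched = terminals_needed(matched_reports, KEYS_MATCHED)
terminals_new = terminals_needed(new_reports, KEYS_NEW)
terminals_total = round(terminals_matched + terminals_new)

# Unit costs use whole terminals and rounded report counts, as the appendix does.
cost_added_location = STATION_COST * 11 / 1_620_000
cost_keyed_report = STATION_COST * 7 / 85_000
cost_keyed_1000_chars = cost_keyed_report * 1_000 / 275
cost_overall = STATION_COST * terminals_total / REPORTS_PER_YEAR

print(f"Terminals: {terminals_matched:.2f} + {terminals_new:.2f} -> {terminals_total}")
print(f"Added location:        ${cost_added_location:.3f}")
print(f"Fully keyed report:    ${cost_keyed_report:.2f}")
print(f"Fully keyed, 1000 ch.: ${cost_keyed_1000_chars:.2f}")
print(f"All reports:           ${cost_overall:.3f}")
```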
PROJECTED NATIONAL LIBRARY NON-CANADIANA 
MACHINE READABLE CATALOGUING 1974-78 



The purpose of the projection presented in this appendix is to help 
indicate: 

1. cataloguing use of the data base in respect of non-Canadiana 
   National Library material 

2. likely storage requirements in the data base. 

ASSUMPTIONS 

1. NL non-Canadiana cataloguing output 

The following figures are drawn from the Forecast of output volumes for 

                                                                % increase 

(a) Full and partial cataloguing: 
    Canadiana and General                 59,609     119,110       100 

(b) Entries to Canadiana                  25,561      42,000        64 

(a) - (b), i.e. Full and partial 
    cataloguing: General                  34,048      77,110       126 

The 126% increase in "General" cataloguing over a five year period 
implies a compound annual rate of increase of 17.7%. The figures relate 
to volumes, copies and entries. Figures for titles provide a more 
meaningful basis on which to project the growth of the National Library's 
machine readable data base. For this reason, the projection below 
relates to figures for titles catalogued, drawn from page 46 of the 
Annual Report of the National Librarian 1972-3. An annual rate of 
increase of 13% compound has been assumed, allowing for the possibility 
that funding and other problems may cause actual performance to lag 
behind budget figures. The result is shown below: 

                                                      1972/3    1977/8    1978/9 

Cataloguing for the National Library Catalogues * 

Full and partial cataloguing: Non-Canadiana titles    16,592    30,000    34,500 
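
Both growth figures can be checked with the usual compound growth formula. The sketch below derives the annual rate implied by a 126% increase over five years and projects the 1972/3 title count forward at 13% a year; the function names are illustrative.

```python
def compound_annual_rate(total_increase, years):
    """Annual rate r such that (1 + r) ** years == 1 + total_increase."""
    return (1 + total_increase) ** (1 / years) - 1


def project(base, annual_rate, years):
    """Value of `base` after compounding at `annual_rate` for `years` years."""
    return base * (1 + annual_rate) ** years


general_rate = compound_annual_rate(1.26, 5)   # 126% over five years
titles_1977_78 = project(16_592, 0.13, 5)      # five years on from 1972/3
titles_1978_79 = project(16_592, 0.13, 6)

print(f"Implied annual rate: {general_rate:.1%}")
print(f"1977/8: {titles_1977_78:,.0f} titles")
print(f"1978/9: {titles_1978_79:,.0f} titles")
```

The projection gives roughly 30,600 titles for 1977/8 and 34,500 for 1978/9, in line with the rounded figures in the table.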

2. NL non-Canadiana machine readable cataloguing 

This commenced in January 1974 and is assumed to cover all NL non-Canadiana 
cataloguing output by 1 April 1977. The growth of the EDP system coverage 
is assumed to be linear. 



PROJECTIONS 

Year commencing 
1st April            Thousand 

1973                     2 
1974                     9 
1975                    16 
1976                    23 
1977                    30 
1978                    35 

Total                  115 



* This figure was obtained by subtracting Canadiana figures for monographs, printed 
  music, audio-visual materials and microforms from total National Library figures 
  for the same materials, since the figures for National Library titles in the 
  Annual Report of the National Librarian 1972-73 include Canadiana cataloguing. 
  Estimates by the National Library staff were obtained in order to determine the 
  number of periodicals catalogued solely for the general collection. 

SAMPLE ISBN INDEX 

ISBN             CANUC    Locations 
                 No. 

0-13-936237-1    024      QMM 
0-13-936245-2    024      QMM 
0-395-12665-7    017      NSY OKIT OD QMQ SSU NSKS OK NFSG NBS BVI 
0-517-50020-5    015      OOS OTB NSHDIP QMFRAN QSHERSH 
0-517-50021-3    015      OOS OTB NSHDIP QMFRAN QSHERSH 
0-687-46744-6    006      OOP QMJ OTMCL 
0-695-40047-9    009      AC ACLS BVA MW NBS NFSG NSHRL OB QMJ SR 
                          YWR 
0-695-80047-7    009      AC ACLS BVA MW NBS NFSG NSHRL OB QMJ SR 
                          YWR 
0-8027-6123-2    010      AC YWR NSHRL BVA NBS MW OB SR QMJ NFSG 
0-8027-6124-0    010      AC YWR NSHRL BVA NBS MW OB SR QMJ NFSG 
0-8130-0323-7    005      OONL QMM OTU 

LC/Canadiana     CANUC    Locations 
No.              No. 

34-28944         001      OOU OOGE CROC QMAY OTULS QMNBZ QQLAM QSHERSS 
                          BVAUW AE NFSQ NSHPD OOMR 
68-57116         012      OOA OONL QMN QMU QHU BVIV 
70-150656        005      OONL QMM OTU 
70-153957        017      NSY OKIT OD QMQ SSU NSKS OK NFSG NBS BVI 
71-131065        016      [?] OOP OLK 
72-8             024      QMM 
72-2499          009      AC ACLS BVA MW NBS NFSG NSHRL OB QMJ SR 
                          YWR 
72-7033          006      OOP QMJ OTMCL 
72-81380         010      AC YWR NSHRL BVA NBS MW OB SR QMJ NFSG 
72-84318         015      OOS OTB NSHDIP QMFRAN QSHERSH 
72-89619         003      OOUM YWR NSDB 
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